(19) Europäisches Patentamt
European Patent Office
Office européen des brevets



(11) EP 0 587 437 B1

(12) EUROPEAN PATENT SPECIFICATION



(45) Date of publication and mention 
of the grant of the patent: 
27.02.2002 Bulletin 2002/09 

(21) Application number: 93307149.0 

(22) Date of filing: 10.09.1993 



(51) Int. Cl.7: H03M 7/30, G11B 20/10,
G11B 20/00



(54) Data compression/decompression and storage of compressed and uncompressed data on a 
single data storage volume 

Datenkompression/-dekompression und Speicherung von komprimierten und nicht komprimierten
Daten in einem einzigen Datenspeichervolumen

Compression/décompression de données et stockage de données comprimées et non-comprimées
sur un seul volume de stockage de données



(84) Designated Contracting States: 
DE FR GB 

(30) Priority: 11.09.1992 US 943613 

(43) Date of publication of application: 
16.03.1994 Bulletin 1994/11 

(73) Proprietor: International Business Machines 
Corporation 

Armonk, N.Y. 10504 (US) 

(72) Inventors: 

• Kulakowski, John Edward 
Tucson, AZ 85715 (US)

• Means, Rodney Jerome 
Tucson, AZ 85715 (US) 

(74) Representative: Moss, Robert Douglas 
IBM United Kingdom Limited Intellectual 
Property Department Hursley Park 
Winchester Hampshire SO21 2JN (GB)






(56) References cited: 
EP-A- 0 490 239 
US-A- 4 506 303 



WO-A-91/19255 



• IBM TECHNICAL DISCLOSURE BULLETIN, vol. 
22, no. 9, February 1980 NEW YORK, US, pages 
4191-4193, ANONYMOUS 'Data Compaction 
Storage System. February 1980.' 

• IBM TECHNICAL DISCLOSURE BULLETIN, vol. 
24, no. 7A, December 1981 NEW YORK, US, 
pages 3202-3203, ANONYMOUS 'Compacting 
Indexed Data. December 1981.' 

• IBM TECHNICAL DISCLOSURE BULLETIN, vol. 
33, no. 9, 1 February 1991 pages 158-164, XP 
000109445 'COMPRESSED SEQUENTIAL DATA 
ON A FIXED BLOCK DEVICE' 

• IBM TECHNICAL DISCLOSURE BULLETIN, vol. 
33, no. 9, February 1991 NEW YORK, US, pages 
158-164, ANONYMOUS 'Compressed Sequential
Data On a Fixed Block Device.' 



Note: Within nine months from the publication of the mention of the grant of the European patent, any person may give 
notice to the European Patent Office of opposition to the European patent granted. Notice of opposition shall be filed in 
a written reasoned statement. It shall not be deemed to have been filed until the opposition fee has been paid. (Art. 
99(1) European Patent Convention). 



Printed by Jouve. 75001 PARIS (FR) 






Description 

FIELD OF THE INVENTION 

[0001] This invention relates to data storage systems that are capable of storing both compressed and uncompressed
data on one data storage volume and to data processing systems utilizing such data storage systems. This invention 
also relates to data storage systems that minimize wasted data storage space on a data storage volume while storing 
compressed data. 

BACKGROUND OF THE INVENTION

[0002] Many data storage media, such as data storage optical disks, have a so-called fixed block architecture (FBA)
format. In an optical disk, such a format is characterized by so-called hard sectoring of the disk's single spiral track
into a plurality of sectors. Every one of the sectors has identical data storage capacity, e.g. 512 bytes, 1024 bytes,
4096 bytes, etc. Because of the fixed sector size of FBA disks and the variability of data lengths of compressed data
with respect to the source uncompressed data, in-line data compression has not been employed with FBA formatted
disks. It is desired to efficiently store and enable simple random address accessing of a variable amount of compressed
data resulting from compressing data which has been formatted into addressable blocks. Such compressed data can
then be recorded on an FBA formatted disk. If the sector data does not compress to fewer bytes, then the data are
stored without data compression on the data storage disk.

[0003] It is also desired when a plurality of addressable data blocks is segmented into a plurality of groups of such
data blocks, to maintain host processor addressability of the compressed data blocks within each compressed group
of data blocks. It is also desired when compressing data for storage on an FBA storage medium to maintain a maximal
addressability of all unused data storing sectors even though the number of sectors required to store the compressed
data blocks is unknown. A further desire is to provide for random addressing of the compressed data blocks recorded
in an FBA formatted storage medium.

[0004] The data pattern randomness of most input data streams, and the variability in the resulting length of the
compressed data output after the application of the various compression algorithms, do not allow for the prediction
of the amount of storage space required to contain the compressed data. This situation requires a link between the
transmission of the data stream to be compressed and recorded and the results of the compression process to assist
the host processor in its storage management process.

[0005] The function of updating a data file in this environment cannot use any usual data updating process (read,
update, write back) because the data pattern resulting from the update may not compress to the same degree as the
original data block, and therefore the updated compressed data most probably will not fit in the storage space
that was required to store the original data.

[0006] In a fixed block architecture (FBA) environment, data are recorded on a data storage medium in fixed sized
units of storage called sectors where each recording track on the medium contains a fixed number of such sectors.
The addressing convention for optical disk devices consists of a track address on the medium and a sector number of
the particular track. On optical media storage devices, each of the sectors consists of two major parts: an Identification
field (ID) used by the device controller to locate a particular sector by a physical address, and a data field for storing
data. The informational content of the IDs on hard sectored optical disks is indelibly recorded, as by a stamping/
molding process, on the medium at the time of manufacture. Other data storage formats also are usable to practice
the present invention, such as the known count-key-data (CKD) and extended count-key-data (ECKD) formats used
on many magnetic disk media.

[0007] An FBA device attached to a host via the known Small Computer System Interface (SCSI) must provide the
capability to resolve a Logical Block Address (LBA), used by SCSI architected direct-access data storage devices to
address fixed sized units of storage, to a unique physical address (track and sector) on the medium. The SCSI attached
FBA device provides to the host a contiguous address space of N (N is a positive integer) storage locations which can
be accessed for reading or writing in any sequence. The LBA directory structure (addresses ranging from 0 to N) is
the addressing mechanism used to store and retrieve data blocks in the SCSI-FBA environment (some FBA devices
also provide the capability to address the storage space using the physical address).
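
The LBA-to-physical resolution described above can be illustrated with a minimal sketch. It assumes a purely linear layout with the same number of sectors on every track and no defect sparing; the function name and the `sectors_per_track` parameter are hypothetical and are not taken from the specification.

```python
def lba_to_physical(lba: int, sectors_per_track: int) -> tuple[int, int]:
    """Resolve a logical block address to a (track, sector) pair.

    Assumes a linear mapping: LBA 0 is sector 0 of track 0, and each
    track holds exactly `sectors_per_track` sectors (hypothetical value).
    """
    if lba < 0 or sectors_per_track <= 0:
        raise ValueError("lba must be >= 0 and sectors_per_track > 0")
    # Quotient is the track address, remainder is the sector on that track.
    return divmod(lba, sectors_per_track)
```

A real device would typically also consult defect and sparing tables before settling on a physical address; that is outside this sketch.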

[0008] As can be seen from the preceding paragraphs, the principal problem facing a designer of a storage system
using data compression techniques in the SCSI-FBA environment is to provide a mechanism by which fixed size units
of data, herein termed data blocks, in an input data stream can be recorded in a variable amount of storage medium
space and still maintain addressability to the unoccupied storage space and provide for addressability to the recorded
data blocks.

[0009] Since many optical disks today are of the removable type, it is further desired to enable each removable data 
storage medium to be self-describing as to compressed and uncompressed data held thereon. 
DISCUSSION OF THE PRIOR ART

[0010] The Vosacek US patent 4,499,539 shows first allocating a number of data storage segments of a cache or
buffer for storing a maximum number of data bytes that are storable in an addressable track of a direct access storage
device (DASD) connected to the cache or buffer. The DASD is a magnetic disk storage device. The protocol is to stage
or transfer one track of DASD data to the cache or buffer in one input-output operation (one access to the DASD).
Upon completion of the actual data transfer, the cache or buffer is examined. If less than all of the first allocated
segments contain data, then the empty allocated segments are deallocated. Pointers are recorded in a first one of the
allocated segments for pointing to additional allocated segments that store data from the same DASD track. In this
manner the DASD track is emulated in the cache or buffer.

[0011] US Patent No 5,097,261 (application No USSN 07/441,126) shows a data compaction system for a magnetic
tape peripheral data storage system. Tapes do not have any addressable data storage areas. The entire tape is formatted
each time it is recorded. This formatting feature in magnetic tapes enables storing variably sized records as
variably sized blocks of data. The storage of uncompressed and compressed data is by addressable blocks of such
data. The application does show including a plurality of records in one block of data recorded on the tape. Co-pending
commonly-assigned US patent No 5,200,864 (application USSN 07/372,744, filed 6/28/89, Attorney docket
TU989003) shows a magnetic tape data storage system that automatically stores a plurality of small records in each
block of recorded data. Each of the records remains individually addressable. A purpose of combining a plurality of
records in one block is to reduce the number of inter-block gaps for increasing the storage capacity of the magnetic tape.

[0012] Data compression and decompression algorithms and systems are well known. US patent 5,109,226 shows
an in-line (real time) data compression/decompression system for use in high speed data channels. This system uses
an algorithm shown in the Langdon, Jr. et al US patent 4,467,317. Batch processed (software) data compression and
decompression is also well known. PKWARE, Inc., 7032 Ardara Avenue, Glendale WI 53209 USA provides the software
programs PKZIP for batch compression and PKUNZIP for batch decompression, among other compression-decompression
software. Another data compression-decompression algorithm has been used for both batch (software processing)
and in-line (hardware-integrated semiconductor chips) processing. The known Lempel-Ziv-1 data compression/decompression
algorithm is used for both in-line (real time) and batch data compression and decompression. It is preferred
to use the latter algorithm. Shah and Johnson, in the article DATA COMPRESSOR DECOMPRESSOR IC in the 1990
IEEE International Symposium on Circuits and Systems, New Orleans LA USA (pp 41-43), May 1-3, 1990, describe
an integrated circuit using the known Lempel-Ziv algorithm mentioned above. In practicing the present invention, it is
preferred that a compression-decompression algorithm that facilitates both batch and in-line operations be used. Of
course, only batch or only in-line data compression-decompression may be used to successfully practice the present
invention.

[0013] Images or "non-coded" data have been compressed and decompressed for saving data storage space.
Reitsma US patent 4,622,585 shows one video compression scheme.

[0014] WO91/19255 discloses storing compressed data to disk, using a logical block size smaller than the physically
formatted block size. A user sets an estimated compression ratio which determines a fixed logical block size from the 
size of physical disk sectors, and compressed data is stored in the logical block. When a block of compressed data is 
too large to fit within a single logical block, an overflow condition occurs and the overflow data is stored in other physical 
blocks. A table stores information linking the blocks that contain data from a compressed data block. 
[0015] EP-A-0490239 discloses a random access storage device, formatted to provide multiple predefined partitions 
with different block sizes. The data to be stored is in blocks of fixed size, and these blocks are compressed if the 
compressed size fits in the block size of a small partition in the storage device. If a data block is not compressible to 
the small block size, it is stored uncompressed in another of the partitions. The memory device contains a table storing
the locations of the blocks in the partitions.

[0016] US 4,506,303 discloses an optical data recording system and method for receiving an input data stream to 
be recorded, dividing the input data stream into sections, compressing each section into a period shorter than the 
original section, providing data gaps between each compressed data section, and recombining the compressed data 
including data gaps into a gapped output stream, and recording the gapped output data. This enables accurate data
retention when using butted or staggered CCD arrays.

SUMMARY OF THE INVENTION 

[0017] The present invention provides flexible data compression-decompression controls that enable randomly
accessing compressed data through relatively simple accessing mechanisms.

[0018] According to a first aspect, the present invention provides apparatus for storing data in compressed form in
a data storage device having a plurality of addressable like-sized data storage areas, each for recording a predetermined
number of data bytes, the data storage device being connected to means for receiving data to be recorded, said
received data being arranged in a plurality of addressable data blocks, characterised in that the apparatus comprises,
in combination: selection means in the means for receiving data for selecting one or more data transfer units of data
blocks to be recorded, each said transfer unit of data blocks having a given number of data bytes and including one
or more of said addressable data blocks; allocation means connected to the selection means for responding to the
number of data bytes in each said transfer unit of data blocks to determine, based on said number of data bytes, a
required first number of addressable data storage areas for storing said transfer unit of data blocks, and to indicate
that said transfer unit requires said first number of addressable data storage areas to record said transfer unit;
compression means connected to the selection means for compressing said transfer unit of data blocks to be recorded as
a group of compressed data blocks; data access means in the device connected to said compression means for
recording said group of compressed data blocks in a second number of said addressable data storage areas as one
continuum of compressed data, said second number being equal to or less than said first number; and directory means
indicating which ones of said addressable data storage areas said continuum of data is recorded in, and indicating that
said continuum of data contains said selected transfer unit of data blocks in a compressed form.
[0019] In a second aspect of the present invention there is provided a method of compressing and recording onto a
data storage medium data of a file which is arranged in a plurality of addressable data blocks, the method comprising
the steps of: selecting a plurality of said data blocks of said file to be compressed and recorded; segmenting the selected
plurality of addressable data blocks into one or more data transfer units, each data transfer unit of data blocks having
a given number of data bytes and including one or more of said addressable data blocks; allocating a first number of
addressable data storage areas of the storage medium for recording each of said one or more data transfer units as
respective separate groups of compressed data blocks, said first number being determined based on the number of
data bytes in each of said one or more data transfer units; compressing each of said one or more data transfer units
and recording them as respective separate groups of compressed data blocks in a second number of addressable
data storage areas of the storage medium, said second number being equal to or less than said first number; creating
and maintaining a file directory indicating the address and size of each of said recorded groups for enabling random
access to recorded data within said file of data blocks.

[0020] Preferably, the file directory provides information for addressing each of the compressed data blocks within 
a group. It is further preferred that the directory is maintained in a host processor and is also stored on the data storage 
medium containing the group of compressed data blocks. 

[0021] In a preferred embodiment of the present invention, a data file having a plurality of addressable data blocks
is segmented into a plurality of groups of such data blocks. Each group of data blocks is separately compressed and
decompressed as one unit of data. Each such group is separately transmitted between a host processor and a data
storage unit, communications link, etc. as one data transfer unit (DTU). The size of the DTU, in terms of the number of
data blocks to be included, is determined empirically based upon the data storage capacity of (number of data bytes
storable in) sectors into which a data storage volume is divided, the number of bytes in each of the data blocks of the
data file, and other system parameters. The data storage of each group in compressed form in a data storage device
is described by the data storage system to the host processor, preferably by a command linked to the host processor
command effecting the data storage in compressed form. The host processor establishes a directory describing the
storage of each and every group of the data file. If the data file is transferred to another system or host processor in
the compressed form, the compressed data file directory accompanies the compressed groups. Retrieving compressed
data from a data storage device is by retrieving the group of data blocks having the data block(s) desired to be read.
Each compressed group of data blocks is transferable between host processors and data storage units without
decompression. The DTU or group-receiving data storage medium may be formatted in the well known fixed-block
architecture (FBA) format, the well known count-key-data (CKD) format, the well known extended count-key-data (ECKD)
format or any other format.

[0022] Embodiments of the present invention will now be described in more detail, with reference to the accompanying
drawings in which:

Fig. 1 is a flow chart illustrating data storing operations according to an embodiment of the present invention;

Fig. 2 is a simplified block diagram of a data processing system in which the data storing operations according to
Fig. 1 may be advantageously employed;

Fig. 3 is a diagrammatic representation of a Logical Block Address (LBA) directory for identifying recorded
compressed groups of data blocks of a data file;

Fig. 4 is a block diagram showing details of an optical data storage system attached to a host processor such as
is shown in Fig. 2;

Fig. 5 is a block diagram of a peripheral controller usable in data processing systems such as are shown in Figs.
2 and 4;

Fig. 6 diagrammatically illustrates storing a compressed group of data blocks according to the steps of Fig. 1;

Fig. 7 diagrammatically illustrates host processor commands using a SCSI connection to a data storage system
such as is shown in Fig. 2 or Fig. 4;

Fig. 8A diagrammatically illustrates a file directory of a plurality of compressed groups of data blocks of a file
according to an embodiment of the present invention;

Fig. 8B diagrammatically illustrates the format of a disk sector according to an embodiment of the present invention;

Figs. 9-13 are flow charts showing details of the operation shown in Fig. 1;

Fig. 14 is a logic diagram illustrating an application of the present invention to a multi-unit data processing system
that has a plurality of data storage devices and host processors interconnected by a data link or local area network;
and

Fig. 15 is a flow chart showing machine operations that update a compressed data file according to an embodiment
of the present invention.

DETAILED DESCRIPTION 

[0023] Referring now more particularly to the appended drawings, like numerals indicate like parts and structural
features in the various figures. A data file having a plurality of data blocks is divided into one or more transfer units of
data blocks. Before data storage, each transfer unit of data blocks is subjected to its own data compression cycle to
create a group of compressed data blocks. The size of the data transfer unit, in bytes, is selected to be facile for
addressing and retrieving individual recorded groups of compressed data blocks while providing good channel utilization
and compression efficiency. The data transfer unit size is also selected in part based upon data storage efficiency,
i.e. the data, after compression, should fill several allocated addressable data storage areas. Each of
the allocated sectors in each group is filled to capacity except the last sector of a group, which may be partially filled. It
is desired to reduce the number of partially filled data storage sectors for more efficiently filling the FBA data storage
disk with data. This desire is balanced with enabling efficient random access to the compressed data blocks stored on
the FBA data storage disk.
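
The storage-efficiency side of this trade-off can be made concrete with a small calculation. The sketch below is illustrative only: the function names and the byte counts used with it are assumptions, not figures from the specification. It computes the unused space in the last, partially filled sector of a group.

```python
def slack_bytes(compressed_bytes: int, sector_bytes: int) -> int:
    """Unused bytes in the last sector of a recorded group."""
    # -n % s is the distance from n up to the next multiple of s.
    return -compressed_bytes % sector_bytes


def slack_fraction(compressed_bytes: int, sector_bytes: int) -> float:
    """Share of the group's allocated sector space that holds no data."""
    if compressed_bytes <= 0:
        raise ValueError("compressed_bytes must be positive")
    slack = slack_bytes(compressed_bytes, sector_bytes)
    return slack / (compressed_bytes + slack)
```

With 1024-byte sectors, a group that compresses to 600 bytes leaves 424 bytes of its single sector unused (about 41 percent), whereas a group that compresses to 6000 bytes occupies six sectors and leaves 144 bytes unused (about 2 percent). Larger groups therefore waste proportionally less space, at the cost of reading and decompressing more data per random access.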

[0024] Each stored or recorded group of compressed data blocks is accessed from disk 30 as a single data unit
irrespective of the number of disk 30 sectors in which the group is recorded. Since each group of compressed data
blocks is compressed in a separate data compression operation, all of the data in each such group must be decompressed
starting with the beginning, i.e. first compressed bytes, in each group. Therefore, in randomly accessing a
desired compressed data block in a given group, all of the compressed data blocks of the stored group are read from
disk 30 as a single disk record. The single disk record is decompressed up to the desired or addressed compressed
data block. The desired compressed data block is then decompressed for processing. Limiting the size of the groups
of compressed data blocks provides for quicker access to any desired compressed data block. This desire is balanced
with a desire to maximize utilization of the disk 30 data storage space. An example of managing these two parameters
for creating a facile size group of compressed data blocks (that varies with each application) is described later.
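
The access pattern described above, decompressing a group from its first byte until the addressed block has been recovered, can be sketched as follows. The sketch substitutes Python's `zlib` (DEFLATE) for the Lempel-Ziv-1 hardware preferred in the specification, and assumes equal-sized data blocks; `read_block` and its parameters are hypothetical names.

```python
import zlib


def read_block(group: bytes, block_index: int, block_bytes: int) -> bytes:
    """Return one data block from a group compressed as a single unit.

    Decompression always starts at the first compressed byte and stops
    as soon as the addressed block is complete.
    """
    end = (block_index + 1) * block_bytes
    decomp = zlib.decompressobj()
    out = bytearray()
    # Feed the disk record in small pieces so decompression can stop early.
    for pos in range(0, len(group), 256):
        out += decomp.decompress(group[pos:pos + 256])
        if len(out) >= end:
            break
    else:
        out += decomp.flush()
    if len(out) < end:
        raise IndexError("block_index is outside the group")
    return bytes(out[end - block_bytes:end])
```

Blocks near the end of a group cost more to reach than blocks near the start, which is one reason the group size is kept moderate.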

[0025] In an alternate arrangement, each data block is separately compressed. A plurality of such separately
compressed data blocks are combined into a single disk record. The byte position within the single disk record for each of
the separately compressed data blocks is recorded in the single disk record. Such byte position or offset enables
addressing each of the compressed data blocks within a group.
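
A minimal sketch of this alternate arrangement follows. The header layout (a block count followed by little-endian 32-bit byte offsets) is an assumption made for illustration; the specification does not fix a format here. `zlib` again stands in for the compression algorithm.

```python
import struct
import zlib


def pack_group(blocks: list[bytes]) -> bytes:
    """Compress each block separately and prepend a table of byte offsets.

    Assumed layout: uint32 count, then count + 1 uint32 offsets measured
    from the start of the record, then the compressed blocks back to back.
    """
    parts = [zlib.compress(block) for block in blocks]
    header_len = 4 * (len(parts) + 2)  # count field + (count + 1) offsets
    offsets = [header_len]
    for part in parts:
        offsets.append(offsets[-1] + len(part))
    header = struct.pack(f"<{len(offsets) + 1}I", len(parts), *offsets)
    return header + b"".join(parts)


def unpack_block(record: bytes, index: int) -> bytes:
    """Decompress only the addressed block, using the offset table."""
    (count,) = struct.unpack_from("<I", record, 0)
    if not 0 <= index < count:
        raise IndexError("no such block in this group")
    start, end = struct.unpack_from("<2I", record, 4 + 4 * index)
    return zlib.decompress(record[start:end])
```

Compared with compressing the whole group as one unit, this avoids decompressing preceding blocks, but each block is compressed without the context of its neighbours, which usually lowers the compression ratio.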

[0026] To facilitate access to the groups of compressed data blocks, the host processor program maintains a directory
that identifies the addressable data storage areas containing the group as well as the data blocks in the respective
groups. This directory identification preferably takes the form of a file directory that is maintained in host processor 11.
Such directory is also stored on the volume or data storage disk containing the group(s) of compressed data blocks.
Preferably, the directory is transmitted to the disk device as a part of each transfer of a compressed file having plural
groups of compressed data blocks. This arrangement establishes on the FBA disk a directory that effects addressability
of the compressed data blocks within the respective groups.
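
One way to picture such a directory is as an ordered list of group entries that can be searched by data block number. The field names below are hypothetical; the directory format actually used by the embodiment is the one shown in Fig. 8A.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupEntry:
    first_block: int    # number of the first data block held in the group
    block_count: int    # how many data blocks the group contains
    first_sector: int   # address of the first sector holding the group
    sector_count: int   # sectors actually occupied after compression
    compressed: bool    # False if the group was stored uncompressed


class FileDirectory:
    """Host-side directory; entries are assumed to be added in block order."""

    def __init__(self) -> None:
        self.entries: list[GroupEntry] = []

    def add_group(self, entry: GroupEntry) -> None:
        self.entries.append(entry)

    def locate(self, block_number: int) -> tuple[GroupEntry, int]:
        """Return the group holding a block and the block's index within it."""
        starts = [e.first_block for e in self.entries]
        i = bisect_right(starts, block_number) - 1
        if i < 0:
            raise KeyError(block_number)
        entry = self.entries[i]
        if block_number >= entry.first_block + entry.block_count:
            raise KeyError(block_number)
        return entry, block_number - entry.first_block
```

The entry returned by `locate` gives the sectors to read as one disk record, and the index tells the reader how far into the decompressed group the desired block lies.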

[0027] Fig. 1 illustrates recording a data file by grouping a plurality of data blocks of the file into a smaller number
of groups of compressed data blocks. Step 10 is executed in a host processor 11 (Fig. 2). A data file, or part of a data
file, is identified for compressed data storage. The data file consists of a plurality of data blocks. The term data block
includes data records (coded data), sub-file structures, individual images, graphs and the like, drawings and other
forms of graphics, combined graphics (non-coded data) and text (coded data), and the like. As later detailed, the data
file is divided into facile sized groups of data blocks for transfer as a data transfer unit (DTU) to a storage unit or over a
communication link and for maintaining a random access capability to the recorded groups of compressed data blocks.
The size of each DTU and resultant recorded group is dependent on diverse variables, as will become apparent.
Completion of one execution of step 10 results in one such group of data blocks being selected for compression and
storage.

[0028] Step 13 is executed by host processor 11 (Fig. 2). The number of uncompressed data bytes in the DTU of
data blocks (the product of the number of data blocks times the number of bytes in each data block) is divided by the
data storage capacity of one addressable data storage area (sector of an FBA formatted disk) and rounded to the next
higher integer if the quotient includes a fraction. This number represents the maximum number of addressable data
storage areas required to store the data, either uncompressed or if compression does not reduce the data to
fewer bytes for storage. At this juncture, it is not known how many addressable data storage areas are required to
store the group of data blocks after compression. To ensure that the group of data blocks is storable on the data storage
medium (optical disk 30 is used in the illustrative embodiment), the number of addressable data storage areas initially
determined is the number sufficient to store the entire group of data blocks in an uncompressed form.
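
The step 13 calculation is a ceiling division. A minimal sketch, with a hypothetical function name and assuming equal-sized data blocks:

```python
def max_sectors_for_dtu(num_blocks: int, block_bytes: int, sector_bytes: int) -> int:
    """Upper bound on the sectors needed to store one DTU (step 13).

    The uncompressed byte count is divided by the sector capacity and
    rounded up, so the bound also covers data that does not compress.
    """
    if sector_bytes <= 0:
        raise ValueError("sector_bytes must be positive")
    total_bytes = num_blocks * block_bytes
    return -(-total_bytes // sector_bytes)  # integer ceiling division
```

For example, a DTU of three 1000-byte blocks on a disk with 1024-byte sectors needs at most three sectors (3000 / 1024 is about 2.93, rounded up).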

[0029] Step 15 is executed by both the host processor 11 and data storage system 12. The selected DTU of data
blocks is transmitted by the host processor to the data storage system. The selected DTU of
data blocks is compressed before storage on the data storage medium 30 (Fig. 4). There are several methodologies
that may be employed herein. The methodology indicated in Fig. 1 requires the data storage system to allocate the
maximum number of addressable data storage areas. The data transfer then occurs, requiring the data storage system
to compress the selected DTU of data blocks just before the data are recorded on the data storage medium 30 (Fig.
4). Upon completion of the compression and data storage or recording as one continuum of data, data storage system
12 determines the number of addressable data storage areas actually used to store the compressed group of data
blocks. The unused but allocated addressable data storage areas are then deallocated. In the event that certain data
blocks compress to a greater number of bytes than the original or uncompressed data, then, as will become apparent,
the data compression step is not used. Control data are recorded on the FBA disk that indicate which data are compressed
and which data are not compressed. Such control data are used in retrieving data from the data storage (FBA)
disk, as will become apparent. As later detailed in this specification, at step 16 data storage system 12 sends the
storage locations of the just-recorded group of compressed data blocks to the host processor 11 for inclusion in a
directory of the data file of which the recorded group of data blocks is a member.
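
The first methodology (allocate the maximum, compress while recording, then release what was not needed) can be sketched as below. Everything here is illustrative: the in-memory `Disk` class, the 1024-byte sector size, and the use of `zlib` in place of the preferred Lempel-Ziv-1 hardware are assumptions, and the compressed flag in the returned record stands in for the control data the specification records on the disk.

```python
import zlib
from dataclasses import dataclass

SECTOR_BYTES = 1024  # assumed sector capacity


def sectors_needed(byte_count: int) -> int:
    return -(-byte_count // SECTOR_BYTES)  # ceiling division


@dataclass
class GroupRecord:
    first_sector: int
    sectors_used: int
    compressed: bool


class Disk:
    """Toy FBA volume that hands out contiguous sectors in address order."""

    def __init__(self, total_sectors: int) -> None:
        self.sectors = [b""] * total_sectors
        self.next_free = 0

    def store_dtu(self, dtu: bytes) -> GroupRecord:
        first = self.next_free
        allocated = sectors_needed(len(dtu))  # maximum, as if uncompressed
        if first + allocated > len(self.sectors):
            raise OSError("not enough free sectors for the uncompressed DTU")
        packed = zlib.compress(dtu)
        compressed = len(packed) < len(dtu)  # otherwise store as received
        payload = packed if compressed else dtu
        used = sectors_needed(len(payload))
        for i in range(used):  # record as one continuum of data
            self.sectors[first + i] = payload[i * SECTOR_BYTES:(i + 1) * SECTOR_BYTES]
        self.next_free = first + used  # deallocate the unused tail
        return GroupRecord(first, used, compressed)

    def load_dtu(self, rec: GroupRecord) -> bytes:
        stop = rec.first_sector + rec.sectors_used
        raw = b"".join(self.sectors[rec.first_sector:stop])
        return zlib.decompress(raw) if rec.compressed else raw
```

The `GroupRecord` returned by `store_dtu` corresponds to the storage locations that data storage system 12 reports back to host processor 11 for entry in the file directory.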

[0030] A second methodology has the data compression-decompression performed in host processor 11. As such,
host processor 11 includes the data compression mechanism, either software or hardware, and sends the compressed
selected group of data blocks to data storage system 12 for storage. In this instance, if batch compression is used,
host processor 11 determines the number of addressable data storage areas required for storing the compressed group
of data blocks. Host processor 11 then sends the required number of addressable data storage areas to data storage
system 12 for allocation just before the compressed data are transmitted to the data storage system.
[0031] In a third methodology, the uncompressed group of data blocks is transmitted twice by host processor 11
to data storage system 12. A first transmission enables data storage system 12 to accurately measure the number of
addressable data storage areas that will be required to store the compressed data. In the first transmission the data
are compressed but not recorded. The number of compressed data bytes is counted to determine the data storage
extent (number of sectors or addressable areas) for the compressed data. The data storage system 12 then allocates
the indicated number of contiguous sectors for receiving and storing the compressed data. A second transmission of
the same data to the data storage system 12 results in the compression and storage of the compressed data in a data
storage medium.
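
The two transmissions of the third methodology can be sketched as two passes over the same data. The function names are hypothetical and `zlib` is a stand-in compressor; the sketch relies on the compressor producing the same output for the same input in both passes.

```python
import zlib


def measure_pass(dtu: bytes, sector_bytes: int) -> int:
    """First transmission: compress, count the bytes, record nothing."""
    compressed_len = len(zlib.compress(dtu))
    return -(-compressed_len // sector_bytes)  # sectors to allocate


def record_pass(dtu: bytes, sector_bytes: int, allocated: int) -> list[bytes]:
    """Second transmission: compress again and split into sector images."""
    packed = zlib.compress(dtu)
    sectors = [packed[i:i + sector_bytes] for i in range(0, len(packed), sector_bytes)]
    if len(sectors) > allocated:
        raise ValueError("compressed data exceed the allocated extent")
    return sectors
```

This approach never allocates more sectors than are finally used, at the price of sending and compressing the data twice.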

[0032] In each of the above described methodologies, if the number of bytes in the compressed data is greater than
the number of uncompressed data bytes, then the data are recorded in the uncompressed form. Further, when updating
a group of compressed data blocks, the number of compressed data bytes may exceed the capacity of the currently
allocated sectors. As described later with respect to Fig. 15, a change in allocation of sectors for storing the updating
DTU may be required.
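
The check implied by this paragraph can be sketched as a small helper that compares the sectors needed by an updated DTU with the sectors already allocated to its group. This is only an assumption about how such a check might look; the actual update procedure is the one described with respect to Fig. 15. `plan_update` is a hypothetical name and `zlib` is a stand-in compressor.

```python
import zlib


def plan_update(updated_dtu: bytes, allocated_sectors: int, sector_bytes: int) -> tuple[int, bool]:
    """Return (sectors needed, whether the allocation must change)."""
    packed = zlib.compress(updated_dtu)
    # Fall back to the uncompressed form if compression does not help.
    payload_len = min(len(packed), len(updated_dtu))
    needed = -(-payload_len // sector_bytes)
    return needed, needed > allocated_sectors
```

When the second value is true, the updated group no longer fits in place and a new extent of sectors has to be allocated, with the file directory updated accordingly.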

[0033] Also, in each of the above described methodologies, the data blocks to be compressed and stored from each
DTU are preferably compressed and stored as one group. That is, all data blocks in each DTU are compressed during
one data compression cycle to produce one group of compressed data blocks. An alternate data compression approach
is to individually compress each of the data blocks in each DTU. Then the group of compressed data blocks consists
of a plurality of individually compressed data blocks. In the alternative data compression, a header in each group can
identify the byte offset within each group of the individually compressed data blocks. Such individually compressed
data blocks may also be identified on the data recording disk by illegal recording code characters; such characters are
well known for diverse data recording codes.

[0034] Host processor 11 in step 19 logically associates all recorded groups of compressed data blocks via a later
described file directory. When employing the above described first methodology, upon storing the compressed data,
data storage system 12 reports to host processor 11 the actual number of sectors used to store the compressed data
and further address and identifying data therefor, as will be described.

[0035] At step 20, host processor 11 determines whether all of the data to be compressed and recorded have been
recorded. The details regarding the recorded group of compressed data blocks (see step 19) have been entered into
the later described file directory (Fig. 8A). If all of the above described machine operations have been completed, then
the operation is "done", enabling exiting to other machine operations beyond the present description. Otherwise, steps
10-19 are repeated as above described until all of the data have been compressed and recorded. It is to be noted that
other machine operations may be performed by host processor 11 in a multi-tasking or interrupt driven data processing
environment while steps 10-19 are in the process of execution, as is known in the data processing art.
[0036] Fig. 2 shows a data processing system in simplified form. Host processor 11 attaches a data storage system
12. Data storage system 12 includes a peripheral controller 20 that connects host processor 11 to data storage device
21. Device 21, in one embodiment of this invention, is a magneto-optical data storage device that operates with
removable magneto-optical data storage media or a single medium (disk). As later used in this specification, the term
programmed machine includes host processor 11, peripheral controller 20 and programmed portions of data storage
device 21. The compression-decompression mechanisms are preferably in the programmed machine. For in-line
compression-decompression, it is preferred that the compression-decompression occurs in the peripheral controller 20.
As later described with respect to Fig. 14, the location of the compression-decompression mechanism can be anywhere
in the programmed machine. For batch compression-decompression it is preferred to place the
compression-decompression in host processor 11.

[0037] Fig. 3 illustrates a logical block address (LBA) structure 23 used in magneto-optical disk data storage systems
for addressing sectors of an optical disk. LBA 23 is a logical to real address translation mechanism that enables full
advantage of practicing the present invention. This sector addressing is based upon the logical addressing found in
many present day optical disk data storage devices. The attaching host processor 11 addresses data on disk 30 (Fig.
4) using a logical block address included in LBA 23. LBA 23 determines which of the addressable physical data storage
areas, such as sectors, are addressed by the respective LBA address. In an alternate addressing
arrangement, host processor 11 requests access to a named file. This alternate addressing arrangement includes host
processor 11 identifying a byte location within the file at which to begin a data operation and a number of bytes (byte
length) to be subjected to the data operation, e.g. read from the disk.

[0038] LBA 23 is managed by either of two algorithms. A first one has been used for optical disks. In this algorithm,
the number of entries in LBA 23 is constant for each disk and is based upon the number of addressable entities in the
disk designated for storing data. Spare addressable data storage areas or sectors are not included in the LBA 23 logical
address sequence, as is known. Known secondary pointers enable addressing spare sectors via LBA 23.
[0039] A second algorithm for addressing using LBA 23 is used in magnetic flexible diskettes. In this second algorithm,
the address range of LBA 23 varies with the number of demarked or unusable sectors. LBA 23 identifies for addressing
only the tracks and sectors that are designated for storing data. In the event one of the sectors identifiable by the
illustrated address translation becomes unusable, then the unusable or defective sector is skipped and replaced by
another sector. Such substitution is well known.

[0040] All of the addressable tracks and sectors on disk 30 are addressed via LBA 23. Such addressing is a table 
look up matching the host processor 11 supplied logical address to a physical disk track and sector storing the data 
identified by the supplied logical address. Each LBA logical address has one entry 14 in LBA 23. 
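
The table look up described above, combined with the second algorithm in which unusable sectors are skipped, can be illustrated by the following sketch. The function names, the representation of a physical address as a (track, sector) pair and the set of defective sectors are assumptions made for illustration only.

```python
def build_lba_table(tracks, sectors_per_track, defective=frozenset()):
    """Build a table with one entry per logical address.

    Each entry holds the (track, sector) pair that stores the data for that
    logical address. Defective sectors are skipped, so the address range
    shrinks as the number of unusable sectors grows.
    """
    table = []
    for track in range(tracks):
        for sector in range(sectors_per_track):
            if (track, sector) not in defective:
                table.append((track, sector))
    return table


def translate(table, lba):
    """Table look up: logical block address to physical track and sector."""
    return table[lba]
```

For example, with two tracks of four sectors and sector 1 of track 0 marked defective, logical address 1 translates to track 0, sector 2, and the table holds seven entries rather than eight.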

[0041] Numerals 17 and 18 indicate groups of compressed data blocks recorded on disk 30 using the present
invention. Numeral 17 indicates the first group of compressed data blocks of one file. Numeral 18 indicates subsequently
recorded groups of compressed data blocks from the same file. The enumeration of the data blocks in the recorded
groups 17-18 is maintained in its original sequence as generated by host processor 11. As will become apparent, the
compressed data blocks in the respective groups are identified in a file directory shown in Fig. 8A.

[0042] A magneto-optic data storage drive or device 21 is illustrated in Fig. 4 as it is connected to host processor 11
via peripheral controller 20. As usual, peripheral controller 20 is packaged with the optical disk drive. A magneto-optic
record disk 30 is removably mounted for rotation on spindle 31 by motor 32. A usual disk cartridge receiver (not
shown) is in operative relation to spindle 31 for inserting and ejecting magneto-optical or other optical disks 30 into
and from drive 21. Optical portion 33 of drive 21 is mounted on frame 35. A headarm carriage 34 moves radially of
disk 30 for carrying an objective lens 45 from track to track. Frame 35 of the recorder suitably mounts carriage 34 for
reciprocating radial motions. The radial motions of carriage 34 enable access to any one of a plurality of concentric
tracks or circumvolutions of a spiral track for recording and recovering data on and from the disk. Linear actuator 36,
suitably mounted on frame 35, radially moves carriage 34 for enabling track accessing. The recorder is suitably attached
to one or more host processors 11; such host processors may be control units, personal computers, large system
computers, communication systems, image signal processors, and the like. Attaching circuits 38 provide the logical
and electrical connections between the optical recorder and peripheral controller 20.

[0043] Device microprocessor 40 controls device 21 including the attachment circuits connected to peripheral
controller 20. Control data, status data, commands and the like are exchanged between attaching circuits 38 and device
microprocessor 40 via bidirectional bus 43. Included in microprocessor 40 is a program or microcode-storing,
read-only memory (ROM) 41 and a data and control signal storing random-access memory (RAM) 42.
[0044] The optics of the recorder (drive or device) 21 include an objective or focusing lens 45 mounted for focusing
and radial tracking motions on headarm 33 by fine actuator 46. This actuator includes mechanisms for moving lens 45
toward and away from disk 30 for focusing and for radial movements parallel to carriage 34 motions; for example, for
changing tracks within a range of 100 tracks so that carriage 34 need not be actuated each time a track adjacent to a
track currently being accessed is to be accessed. Numeral 47 denotes a two-way light path between lens 45 and disk 30.
[0045] In magneto-optic recording, magnetic bias field generating coil 48, an electromagnet in a constructed
embodiment, provides a weak magnetic steering or bias field for directing the remnant magnetization direction of a small
spot on disk 30 illuminated by laser light from lens 45. The laser light spot heats the illuminated spot on the record disk to
a temperature above the Curie point of the magneto-optic layer (not shown, but can be an alloy of rare earth and
transitional metals as taught by Chaudhari et al., USP 3,949,387). This heating enables the bias field generated by
magnet coil 48 to direct the remnant magnetization to a desired direction of magnetization as the spot cools below the
Curie point temperature. Magnet coil 48 is shown as supplying a bias field oriented in the "write" direction, i.e., binary
ones recorded on disk 30 normally are "north pole remnant magnetization". To erase disk 30, magnet coil 48 supplies a
field so the south pole is adjacent disk 30. Control 49 for magnet coil 48 is electrically coupled to magnet coil 48 over
line 50 to control the write and erase directions of the magnetic field generated by coil 48. Microprocessor 40 supplies
control signals over line 51 to control 49 for effecting reversal of the bias field magnetic polarity.

[0046] It is necessary to control the radial position of the beam following path 47 such that a track or circumvolution
is faithfully followed and that a desired track or circumvolution is quickly and precisely accessed. To this end, focus
and tracking circuits 54 control both the coarse actuator 36 and fine actuator 46. The positioning of carriage 34 by
actuator 36 is precisely controlled by control signals supplied by circuits 54 over line 55 to actuator 36. Additionally,
the fine actuator 46 control by circuits 54 is exercised through control signals travelling to fine actuator 46 over lines
57 and 58 for effecting, respectively, focus and track following and seeking actions. Sensor 56 senses the
relative position of fine actuator 46 to headarm carriage 33 to create a relative position error (RPE) signal. Line 57
consists of two signal conductors, one conductor for carrying a focus error signal to circuits 54 and a second conductor
for carrying a focus control signal from circuits 54 to the focus mechanisms in fine actuator 46.
[0047] The focus and tracking position sensing is achieved by analyzing laser light reflected from disk 30 over path
47, thence through lens 45, through one-half mirror 60 and to be reflected by half-mirror 61 to a so-called "quad detector"
62. Quad detector 62 has four photoelements which respectively supply signals on four lines collectively denominated
by numeral 63 to focus and tracking circuits 54. Aligning one axis of the detector 62 with a track center line, track
following operations are enabled. Focusing operations are achieved by comparing the light intensities detected by the
four photoelements in the quad detector 62. Focus and tracking circuits 54 analyze the signals on lines 63 to control
both focus and tracking.

[0048] Recording or writing data onto disk 30 is next described. It is assumed that magnet 48 is rotated to the desired
position for recording data. Microprocessor 40 supplies a control signal over line 65 to laser control 66 for indicating
that a recording operation is to ensue. This means that laser 67 is energized by control 66 to emit a high-intensity laser
light beam for recording; in contrast, for reading, the laser light beam emitted by laser 67 is of a reduced intensity so as
not to heat the laser illuminated spot on disk 30 above the Curie point. Control 66 supplies its control signal over line 68
to laser 67 and receives a feedback signal over line 69 indicating the laser 67 emitted light intensity. Control 66 adjusts
the light intensity to the desired value. Laser 67, a semiconductor laser, such as a gallium-arsenide diode laser, can
be modulated by data signals so the emitted light beam represents the data to be recorded by intensity modulation. In
this regard, data circuits 75 (later described) supply data indicating signals over line 78 to laser 67 for effecting such
modulation. This modulated light beam passes through polarizer 70 (linearly polarizing the beam), thence through
collimating lens 71 toward half mirror 60 for being reflected toward disk 30 through lens 45. Data circuits 75 are prepared
for recording by the microprocessor 40 supplying suitable control signals over line 76. Microprocessor 40 in preparing
circuits 75 is responding to commands for recording received from a host processor 11 via attaching circuits 38. Once
data circuits 75 are prepared, data is transferred directly between peripheral controller 20 and data circuits 75 through
attaching circuits 38. Data circuits 75 also include ancillary circuits (not shown) relating to disk 30 format signals, error
detection and correction, and the like. Circuits 75, during a read or recovery action, strip the ancillary signals from the
readback signals before supplying corrected data signals over bus 77 to peripheral controller 20 via attaching circuits 38.
[0049] Reading or recovering data from disk 30 for transmission to host processor 11 requires optical and electrical
processing of the laser light beam from the disk 30. That portion of the reflected light (which has its linear polarization
from polarizer 70 rotated by disk 30 recording using the Kerr effect) travels along the two-way light path 47, through
lens 45 and half-mirrors 60 and 61 to the data detection portion 79 of the headarm 33 optics. Half-mirror or beam
splitter 80 divides the reflected beam into two equal intensity beams both having the same reflected rotated linear
polarization. The half-mirror 80 reflected light travels through a first polarizer 81 which is set to pass only that reflected
light which was rotated when the remnant magnetization on disk 30 spot being accessed has a "north" or binary one
indication. This passed light impinges on photocell 82 for supplying a suitable indicating signal to differential amplifier
85. When the reflected light was rotated by a "south" or erased pole direction remnant magnetization, then polarizer
81 passes no or very little light resulting in no active signal being supplied by photocell 82. The opposite operation
occurs by polarizer 83 which passes only "south" rotated laser light beam to photocell 84. Photocell 84 supplies its
signal indicating its received laser light to the second input of differential amplifier 85. The amplifier 85 supplies the
resulting difference signal (data representing) to data circuits 75 for detection. This detection, in the illustrated
embodiment, does not include digital demodulation (decoding the read back signals from a 1-7 d-k code to data in a host
processor format). The detected signals include not only data that is recorded but also all of the so-called ancillary
signals as well. The term "data" as used herein is intended to include any and all information-bearing signals, preferably
of the digital or discrete value type.

[0050] The rotational position and rotational speed of spindle 31 are sensed by a suitable tachometer or emitter sensor
90. Sensor 90, preferably of the optical-sensing type that senses dark and light spots on a tachometer wheel (not
shown) of spindle 31, supplies the "tach" signals (digital signals) to RPS circuit 91 which detects the rotational position
of spindle 31 and supplies rotational information-bearing signals to microprocessor 40. Microprocessor 40 employs
such rotational signals for controlling access to data storing segments on disk 30 as is widely practiced in the magnetic
data storing disks. Additionally, the sensor 90 signals also travel to spindle speed control circuits 93 for controlling
motor 32 to rotate spindle 31 at a constant rotational speed. Control 93 may include a crystal-controlled oscillator for
controlling motor 32 speed, as is well known. Microprocessor 40 supplies control signals over line 94 to control 93 in
the usual manner.

[0051] Peripheral controller 20 is shown in Fig. 5. This controller includes the compression-decompression
mechanism for in-line or real time data compression-decompression. A connection between host processor 11 and
peripheral controller 20 is effected by a SCSI module 100 that implements the known small computer system interface.
An IO data buffer 103 (dynamically allocated into input data buffers and output data buffers using known techniques)
temporarily stores data received from or to be transmitted to the host processor 11. An Optical Disk Controller (ODC) 104
manages the reading and writing of the data to the disk 30 (Fig. 4). Error Correction Control (ECC) module 106 detects
and corrects errors in data being read and generates ECC error detection and correction redundancy characters to be
written to the medium with the data. Run Length Limited (RLL) (mod-demod) encoding and decoding is performed in
data circuits 75 (Fig. 4). Such mod-demod encodes and decodes recorded data patterns, such as used in the known
1-7 d-k code. Microprocessor 107 (plus control store 108 and dynamic store 109) controls the various elements of the
controller 20. A Compression/Decompression (CD) module 101, such as an integrated circuit referred to by Shah et
al, supra, implements the compression algorithms. CD module 101 includes automatic circuit timing and control, as is
known, to control data flow through peripheral controller 20 under supervision of microprocessor 107. This
compression-decompression is in real time (in-line) with the data transfer. Busses 102, 110 and 111 interconnect the
modules, as shown. Controller 20 is preferably packaged with a device 21 on a common frame.

[0052] Fig. 6 illustrates compression of several data blocks into one group of compressed data blocks recorded in
a number of data storing sectors 118 of track 117 of disk 30. A group 115 of a plurality of data blocks 116 is selected
for recording as described with respect to Fig. 1. Group 115 of data blocks is transmitted to controller 20
by host processor 11. CD 101 in controller 20 compresses group 115 sufficiently to be recorded as a group of
compressed data blocks in sectors 118 plus about one-half of sector 119. The remaining half of last sector 119 is filled
with padding bytes, as is known. Numeral 122 indicates a sector that was allocated previously. Numeral 123 indicates
next sector(s) that were initially allocated according to the above-described first methodology. The linked response of
controller 20 to the write-compress command indicates to host processor 11 that sector(s) 123 are to be deallocated
as such sectors did not receive any of the data from group 115. Host processor 11 responds to controller 20 to deallocate
sectors 123.
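
The sector accounting of Fig. 6 (sectors actually used, padding bytes in the last sector, and initially allocated sectors to be deallocated) reduces to simple arithmetic, sketched below. The function name and its arguments are hypothetical and are chosen only for this illustration.

```python
import math


def sectors_after_compression(compressed_bytes, sector_bytes, allocated_sectors):
    """Return (sectors used, padding bytes, sectors to deallocate).

    The last used sector is filled with padding bytes; any initially
    allocated sectors that received no data are reported for deallocation.
    """
    used = math.ceil(compressed_bytes / sector_bytes)
    padding = used * sector_bytes - compressed_bytes
    to_deallocate = max(allocated_sectors - used, 0)
    return used, padding, to_deallocate
```

For instance, 2560 compressed bytes on 1024-byte sectors with four sectors initially allocated occupy three sectors, leave 512 padding bytes in the last used sector, and free one sector.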

[0053] The above description assumes that host processor 11 is performing data space management. This
arrangement is usual. It is to be pointed out that, in a multi-host arrangement sharing device 21, one of the hosts may
be designated to perform space management. Also, in some systems the peripheral controller performs data storage
space management.

[0054] Fig. 7 illustrates in abbreviated form three commands for use in a known SCSI interface. WRITE command
130 includes the operation code field 131 that indicates the command is a WRITE command. LBA address field 132
indicates the first LBA address at which data being transmitted in accordance with the instant WRITE command is to begin
(the lowest LBA address of possibly several LBA addresses required to be used in storing data into a plurality of disk
30 sectors). Field 133 indicates the number of units of data that are to be transferred from host processor 11 to device
21 for storage on disk 30. One unit is that data storable in one sector of the disk 30. FBA disks may have different data
storing capacity sectors, such as 512, 1024 (1 KB), 2048, or 4096 bytes of data. Field 134 indicates whether or not the
data to be transmitted is to be compressed. Field 135 indicates that this WRITE command is linked to read buffer
command 140. This command linkage requires peripheral controller 20 to report to host processor 11 the details of the
data storage, i.e. number of sectors actually used, the data that enables host processor 11 to build an entry for the
later described Fig. 8A illustrated file directory, and identifies the sectors to be deallocated. It is noted that LBA 23 is
updated in host processor 11 with a copy thereof recorded in a sector of disk 30. Also, a copy of the Fig. 8A illustrated
file directory is recorded on disk 30, preferably in an uncompressed form at a first LBA 23 logical address that immediately
precedes the first LBA address for storing compressed data.

[0055] Read buffer SCSI command 140 includes operation code field 141 that indicates the command is a READ
BUFFER command. Controller 20 responds to receipt of a READ BUFFER command to transfer data from an output
register(s) of IO buffer 103. Controller 20 stores the information relating to storing a group of compressed data blocks
in such output buffer 103 register(s) in preparation to respond to the READ BUFFER command linked to the WRITE
command 130. Field 142 indicates to controller 20 the number of sectors used to store the compressed data blocks.
That is, host processor 11 knows the number of disk sectors required for storing the compressed data blocks, hence
the new entry for the Fig. 8A illustrated file directory.

[0056] READ DATA command 145 has operation code field 146 having an indication that the command is a READ
DATA command. The first LBA address to be used for transferring data from disk 30 to host processor 11 is indicated
in field 147. Field 148 indicates the number (n) of data blocks requested or commanded to be transferred from the FBA
disk to the host processor 11. Field 149 indicates to controller 20 the number (N) of disk sectors that are to be read.
Field 150 indicates that decompress is either on or off. Link on bit 151 is usually reset to be inactive. For reading one
group of compressed data blocks, controller 20 reads the indicated number (N) of sectors, decompresses the data
blocks, then transfers the decompressed data blocks to host processor 11. Controller 20 counts the number of data
blocks transferred such that when the indicated number n of field 148 is reached, the data transfer is terminated. The
data block counting is also used as an integrity check.
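
The controller's handling of a READ DATA command (read N sectors, decompress, transfer n data blocks, with the block count doubling as an integrity check) can be sketched as below. This is a simplified model under stated assumptions: zlib stands in for CD module 101, data blocks are assumed to be of a fixed size, the disk is modeled as a list of sector-sized byte strings, and read_data and its parameters are hypothetical names.

```python
import zlib


def read_data(sectors, first_lba, n_sectors, n_blocks, block_size, decompress=True):
    """Read n_sectors starting at first_lba and return n_blocks data blocks."""
    raw = b"".join(sectors[first_lba:first_lba + n_sectors])
    if decompress:
        # The decompressor stops at the end of the compressed stream, so
        # padding bytes in the last sector are ignored.
        data = zlib.decompressobj().decompress(raw)
    else:
        data = raw
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    # Integrity check: fewer blocks than requested indicates a problem.
    if len(blocks) < n_blocks:
        raise ValueError("fewer data blocks recovered than requested")
    # The transfer terminates once the requested count n is reached.
    return blocks[:n_blocks]
```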

[0057] The Fig. 8A illustrated file directory can indicate different levels of detail; the selected level is application
dependent. Every file that has data blocks recorded in groups of compressed data blocks has a separate portion of
the directory respectively indicated by numerals 161, 162 and 163 for three different data files. Each row 160 of each
directory represents one entry. A first entry in each directory includes in column 164 the filename of the file and the LBA
address at which the directory is recorded on disk 30. Column 165 in the first or top most entry indicates the number
of data blocks in each data transfer unit. The term data transfer unit (DTU) indicates that a given number of data blocks
are to be transferred between disk 30 and host processor 11 during each data transfer. The remaining entries 160 are
respectively for the transmitted and recorded groups of compressed data blocks. Again, column 164 in the respective
entries indicates the first LBA address used to store the group. Column 165 indicates the number of data blocks
recorded and the number of sectors used to store the respective groups of compressed data blocks on disk 30. Once
all of the data blocks are compressed in a single data compress operation, the group of compressed data blocks is
a continuum of data with no external indication of the data block boundaries. The decompression mechanism and
associated controls identify the data block boundaries after decompression, as is known.

[0058] In addition to the information contained in the Fig. 8A illustrated file directory, additional details of each group
may be provided. In such an alternate implementation of the file directory, controller 20 returns, in addition, for each
group of compressed data blocks (i.e. for each respective entry of the Fig. 8A illustrated file directory) a map of the
relation of data blocks and data storing sectors (uses the LBA logical address, not the actual physical location on disk
30) for each of the groups. This additional information is used by the host to manage the recorded data and unused
disk 30 sectors indicated in LBA 23.
[0059] All entries contain the above indicated mapping of data blocks to LBA addresses for each and every group
(Gp.) of compressed data blocks in the current file. That is, each data block is indicated as being recorded in one or
more sectors, depending on the compression and size of the data blocks. Several compressed data blocks may be
recorded in one sector. In this instance, the LBA addresses are the same for starting and ending, i.e. LBA 10 to LBA 10
for example could occur for several data blocks.
[0060] A format of the Fig. 8A illustrated directory using the additional addressing information is set forth below.



First entry     Filename          Number of data blocks in a data transfer unit
Second entry    Gp. 1 LBA         Number of data blocks and sectors in this group
                data block n      LBA N at byte B
                data block n+1    LBA N at byte B²
                data block n+2    LBA N² at byte B³

(Map of all data blocks in group (Gp.) 1 continues; the term "byte" indicates the byte displacement of the respective
compressed data block as recorded in a sector.)

Third entry     Gp. 2 LBA         Number of data blocks and sectors

(Map of data blocks to LBA addresses is as set forth above)

[0061] The superscripts merely indicate the first (no superscript), second, etc. byte positions of the respective blocks n,
n+1, n+2, etc.

[0062] The above described directory structures enable the data contents of a single group of compressed data
blocks to be updated without the necessity of reading and then rewriting the entire file. An update of a group of
compressed data blocks only requires the reading of one group of compressed data blocks. The updated group of
compressed data blocks may require more sectors for storage than were used to store the previous generation group of
compressed data blocks. That is, additional sectors have to be allocated. Since it is desired that each group of compressed
data blocks be recorded in contiguous sectors (except for unaddressable intervening defective sectors), a new allocation
may be required. All of this activity is explained later with respect to Fig. 15. Host processor 11 uses this information
to determine the next available set of contiguous LBA addresses that has a sufficient number of addresses (sectors)
for storing the updated group of compressed data blocks.

[0063] For WORM (write once, read many) optical disks, the host processor may issue a MEDIUM SCAN command 
to locate the next available LBA addressed sector for storing the updated group of compressed data blocks. Host 
processor 11 saves this information in an expanded directory entry for use when the data are to be retrieved or read. 
[0064] As later described with respect to Fig. 10, another control parameter is a minimum or maximum number of
sectors to be used in the CKD and ECKD examples for practicing the present invention. The number N of sectors
required to store the uncompressed data is compared with a MIN (minimum value) and a MAX (maximum value). If
the number of required sectors is between the MIN and MAX values, then a DTU is made using the number N. MIN
ensures a reasonable usage of disk storage space while MAX ensures a reasonable access to compressed data blocks.
If N is greater than MAX, then N is made equal to MAX. If N is less than MIN, then N is made equal to MIN. The number
of data bytes in a DTU is N*SB (SB is number of bytes storable in one sector) for FBA devices and N*DB (DB is number
of data bytes desired for storing one data block) for CKD and ECKD devices. The number of bytes in a DTU is stored
in the first or top entry 160 (Fig. 8A) of each file directory. As one variation, field 166 in each of the entries 160 contains
a compress DTU indicating bit C. If C is unity, then the data represented by the respective entry 160 are recorded in
a compressed form. If bit C is zero or nil, then the data are recorded on disk 30 without data compression. The
compress bit C may also be recorded in each and every sector storing data in accordance with the present invention.
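
The clamping of N between MIN and MAX and the resulting DTU size in bytes can be expressed compactly, as in the following sketch. The function name dtu_bytes is hypothetical; unit_bytes stands for SB on FBA devices or DB on CKD and ECKD devices.

```python
def dtu_bytes(n_required, min_units, max_units, unit_bytes):
    """Clamp N to the range [MIN, MAX] and return (N, bytes in the DTU)."""
    n = max(min_units, min(n_required, max_units))
    return n, n * unit_bytes
```

For example, with MIN = 4, MAX = 64 and 1024-byte sectors, a request for 100 sectors is limited to 64 sectors (65536 bytes), while a request for one sector is raised to four sectors (4096 bytes).
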
[0065] Fig. 8B diagrammatically illustrates the format of a disk sector of an FBA disk. Sector 170 is in track 169 of disk
30. Intersector gap 171 separates sector 170 from an immediately preceding sector (not shown). Sector ID 172 is an
embossed area that contains the track and sector address of sector 170. Intrasector gap 173 separates the hard
sectored or embossed mark 172 from the magneto-optically recorded portion that constitutes the remainder of sector
170. Data synchronization signals DATA SYNC 174 are magneto-optically recorded with the data stored in portion 175
of sector 170. Control area 176 stores magneto-optically recorded control signals, as may be desired. A compress bit
C 177 (considered a part of the control signals in area 176), if set to unity, indicates that the data in portion 175 are
compressed. If C 177 is set to zero or nil, then the data stored in portion 175 are not compressed. Sector 170 ends with
the error detection and correction redundancy in ECC 178 portion. ECC 178 stored signals are generated and stored
in a known manner that is not pertinent to an understanding of the present invention. Intersector gap 179 separates
sector 170 from a next succeeding sector 180. It is preferred that compress bit 177 be used while practicing the present
invention.

[0066] Fig. 9 is a flow chart showing a sequence of machine operations for storing a file in a plurality of groups of
compressed data blocks wherein each group is separately transmitted from a host processor to a data storage system
as a DTU having a number of uncompressed bytes as set forth above. At step 185 the data to be recorded is analyzed
for determining the number of DTU's to be generated. The actual size in bytes/data blocks of a DTU may be different
from file to file. In step 186, the DTU size is modified to accommodate the number of data blocks to be initially recorded
for equalizing the sizes of a plurality of DTU's to be used. For example, if the number of data blocks to be compressed
and recorded is less than two desired DTU's and one half of the number of data blocks results in a number of data
bytes greater than MIN, then two DTU's each having one-half of the data blocks are created. This same principle is
applied to transferring data blocks having any number of DTU's except for updating a recorded group of compressed
data blocks, as will become apparent. If the DTU sizes cannot be equalized, then a last DTU may have a number of
bytes less than the MIN (minimum) number of bytes. Upon updating the recorded group of compressed data blocks
resulting from a small last DTU, a DTU is generated that adds a number of data blocks to make the size of the DTU,
hence group of compressed data blocks, larger to meet the DTU size requirements set forth with respect to Figs. 9
and 11. Fig. 15 relating to updating a recorded group of compressed data blocks illustrates machine steps for storing
an updated DTU that is too large for the current allocated data storage space for a recorded group of compressed data
blocks resulting from compressing and storing the updated DTU.
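
One possible reading of the DTU size equalization of step 186 is sketched below: the data blocks are divided over the smallest number of DTU's that does not exceed the desired DTU size, with the DTU's kept as equal as possible. The function name and the rounding rule are assumptions made for the sketch; the check against MIN described above is omitted for brevity.

```python
import math


def equalize_dtus(total_blocks, desired_blocks_per_dtu):
    """Split total_blocks into near-equal DTU's, none above the desired size.

    Assumes total_blocks and desired_blocks_per_dtu are positive integers.
    """
    count = math.ceil(total_blocks / desired_blocks_per_dtu)
    base, extra = divmod(total_blocks, count)
    # The first 'extra' DTU's carry one additional block each.
    return [base + 1 if i < extra else base for i in range(count)]
```

Thus 30 data blocks with a desired DTU size of 20 blocks yield two DTU's of 15 blocks each, rather than one DTU of 20 blocks and one of 10.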

[0067] Following "GET DTU" step 186 (Fig. 9), a DTU of data blocks is built for data transfer. Step 189 transfers the
DTU to data storage system 12. Data storage system 12 compresses and stores the transferred DTU as described
earlier. At step 190, host processor 11 ascertains whether another DTU is to be transferred. If not (DONE = 1), then
host processor 11 exits for performing other work not related to practicing the present invention. Otherwise, steps 188
and 189 are repeated until all DTU's have been transmitted to data storage system 12.

[0068] Fig. 10 is a flow chart showing selecting a MIN and a MAX value respectively for image (non-coded or graphics)
data and text (coded) data. The compressibility of data is a measure for selecting MIN and MAX. In this regard, each
file of image or text data may compress substantially differently from data from other files as well as changing from data
block to data block in either type of data, image or text. Once a first group of data blocks has been compressed and
recorded as a group of compressed data blocks, the compression ratio may be recorded in the Fig. 8A illustrated file
directory as a reference for subsequent compression and storage of data blocks. The Fig. 10 illustration assumes that
the image data has been compressed 75% (compressed image data blocks are 25% of original size) and text data
blocks have been compressed about 50%. These measured values may be changed for calculation purposes for adding
a margin of error accommodation into the calculations.

[0069] Step 195 determines whether the data in the file are text or image. If image, step 196 calculates the MIN value
as 4*SB (SB being the number of bytes in a sector), i.e. at least four sectors are to be used for storing a group of
compressed data blocks. The number four is selected in an arbitrary manner. Sector size affects the minimum number of
sectors to be used. Step 197 calculates MAX as being 64*SB. On an FBA disk having 1024-byte sectors, the maximum DTU
size is then 64 KB (kilobytes). Again, system considerations may change these values. Such considerations are beyond the
present description. From step 195, for text data (IMAGE DATA = NO), step 200 calculates MIN as 2*SB while step 201
calculates MAX as 32*SB. The number of uncompressed bytes for image data in MIN and MAX is equal to the number
of uncompressed bytes for text data. The different compression ratios change MIN and MAX values inversely to the
expected compression ratio. Upon completing either calculation, host processor 11 stores the MIN and MAX values in
the first entry 160 (Fig. 8A) of the appropriate file directory and then exits the calculation.
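
The Fig. 10 selection of MIN and MAX can be illustrated with a short sketch. The function and parameter names are hypothetical, and the multipliers are simply the ones given above (4 and 64 sectors for image data, 2 and 32 sectors for text data). The helper `expected_compressed_bytes` is an added illustration of the arithmetic: if MIN and MAX are read as DTU sizes in uncompressed bytes, the assumed ratios (image data compressing to 25%, text data to 50%) lead to the same compressed size for both data types.

```python
def select_min_max(is_image, sector_bytes):
    """Return (MIN, MAX) in bytes, following steps 195-201 as described."""
    if is_image:
        return 4 * sector_bytes, 64 * sector_bytes   # steps 196 and 197
    return 2 * sector_bytes, 32 * sector_bytes       # steps 200 and 201


def expected_compressed_bytes(uncompressed_bytes, remaining_fraction):
    """Size after compression if the data shrink to `remaining_fraction`."""
    return int(uncompressed_bytes * remaining_fraction)


if __name__ == "__main__":
    sb = 1024  # sector size used in the text's FBA example
    img_min, img_max = select_min_max(True, sb)
    txt_min, txt_max = select_min_max(False, sb)
    print(img_min, img_max, txt_min, txt_max)
    # Under the assumed ratios both MAX values compress to the same size.
    print(expected_compressed_bytes(img_max, 0.25),
          expected_compressed_bytes(txt_max, 0.50))
```

A real system would presumably replace the fixed fractions with the compression ratio recorded in the file directory, adjusted by whatever margin is chosen.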

[0070] The MIN and MAX values may also be predetermined and included as parameter data defining a class of
data as set forth in Gelb et al US patent number 5,018,060 titled "ALLOCATING DATA STORAGE SPACE OF PERIPHERAL
DATA STORAGE DEVICES USING IMPLIED ALLOCATION BASED ON USER PARAMETERS". Gelb et
al teach that data set parameters implicitly control peripheral data storage operations. Such implicit control based on
data base or file parameter data may be applied to practicing the present invention.

[0071] Fig. 11 shows execution of a WRITE command by data storage system 12 wherein the data blocks received
in one DTU are compressed and then recorded as a group of compressed data blocks. Step 210 receives a WRITE command
130. Step 211 sets the link commanded in field 135 for reporting the actual number of sectors used to store the resultant
group of compressed data blocks and the compression ratio CR achieved. Step 212 sets a compress mode in data
storage system 12 for activating CD 101 to compress the data blocks being received into one continuum of compressed
data. Step 213 receives, compresses and stores the DTU data blocks. Step 216 compares the number of sectors
actually used to store the compressed data with the number of sectors initially allocated. Step 217 compares the byte
count of the original data blocks in the received DTU with the byte count of the compressed data blocks. In most
instances, the byte count of the compressed data blocks will be less than the byte count of the original DTU data blocks.
In this instance, at step 218, data storage system 12 indicates to host processor 11 that the data storage operation
has been completed. The identification of any unused sectors, plus other information describing the just-completed
data recording operation, is to be transferred from data storage system 12 to host processor 11. This transfer is effected
by host processor 11 responding to the indication of a completed recording operation by issuing a READ BUFFER
command 140 to data storage system 12 to send the number of unused allocated sectors and all other compression
information to host processor 11. Host processor 11 in step 219 responds to the indication of unused allocated sectors
by deallocating such sectors for use in storing other data. Note that if the compress bit 134 is off, then no compression
occurs.
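
The WRITE handling of Fig. 11 can be sketched as below, including the case taken up in the next paragraph where compression enlarges the data and the DTU is resent for uncompressed storage. This is a sketch under stated assumptions only: `zlib` stands in for compressor CD 101, a returned dictionary stands in for the READ BUFFER report, a `"retry"` status stands in for the channel command retry, and the names `device_write`, `host_write` and `SECTOR_BYTES` are hypothetical.

```python
import math
import zlib

SECTOR_BYTES = 1024  # assumed sector size for this sketch


def device_write(dtu, allocated_sectors, compress=True):
    """Model of the storage-system side: compress, store, report usage."""
    if compress:
        payload = zlib.compress(dtu)      # stand-in for compressor CD 101
        if len(payload) > len(dtu):
            # Compression grew the data: ask the host to resend (CCR analogue).
            return {"status": "retry"}
    else:
        payload = dtu
    used = math.ceil(len(payload) / SECTOR_BYTES)
    return {
        "status": "done",
        "compressed": compress,
        "sectors_used": used,
        "unused_sectors": allocated_sectors - used,
        "ratio": len(dtu) / len(payload) if payload else 1.0,
    }


def host_write(dtu):
    """Model of the host side: allocate by uncompressed size, then write."""
    allocated = math.ceil(len(dtu) / SECTOR_BYTES)
    report = device_write(dtu, allocated, compress=True)
    if report["status"] == "retry":
        # Resend the same DTU for storage without compression.
        report = device_write(dtu, allocated, compress=False)
    # The host would now deallocate report["unused_sectors"] sectors.
    return report


if __name__ == "__main__":
    print(host_write(b"abc" * 10000))      # highly compressible data
    print(host_write(bytes(range(256))))   # data that zlib cannot shrink
```

Because the host allocates sectors from the uncompressed byte count, the stored data never need more sectors than were allocated in this model, whichever path is taken.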

[0072] If at step 217 it is determined that data compression resulted in more data bytes in the compressed data
blocks than were in the original data blocks, then the data blocks will be recorded without data compression. This
growth in size of the compressed data blocks may occur when the original data blocks have certain data patterns. In
any event, at step 220, data storage system 12 sends a channel command retry (CCR) or its equivalent to host processor
11. CCR indicates that the DTU has to be retransmitted by host processor 11 to data storage system 12. That is, the
increase in size of the DTU after compression is considered an error condition. The CCR indicates that a recording
error has occurred. Host processor 11 responds to the CCR at step 221 by resending the DTU to data storage system
12. At step 222, data storage system 12 stores the DTU without data compression. The above-described operations
are exited from either step 219 or 222.

[0073] Fig. 12 is a flow chart showing system operations for reading data. Host processor 11 in step 225 prepares
to read data, i.e., identifies the data blocks to be read. Host processor 11 then in step 226 searches for a file directory
(Fig. 8A). Such file directory may be read from disk 30. If there is no file directory relating to compression, then the
data are not compressed. Also, if field 166 of the Fig. 8 illustrated directory for the identified group is zero, then
that group is not compressed. Further, if data to be read are compressed and it is desired to decompress in a unit other
than the storing data storage system 12, step 226 directs host processor operations to read all identified data without
decompression via path 227. From path 227, a usual data recording operation not involving data compression is performed
(not shown). Host processor 11 builds and issues one READ command 145 for each of the recorded groups of compressed
data blocks to be read. Depending on the desired read operation, field 150 of the READ command will be set to indicate
either decompress ON or decompress OFF. Host processor 11, before sending the READ command 145 to data storage
system 12, examines field 150 at step 226. If host processor 11 at step 226 finds that the data to be read are compressed
and decompression is desired, then step 230 sets field 150 to compress ON. All of the groups of compressed data
blocks having data blocks to be read are identified in step 231 via examination of the appropriate file directory 161-163.
Host processor 11 in step 232 then builds one or more READ commands 145 for reading the step 231 identified groups
of compressed data blocks with decompression. The term "build" used above indicates that the appropriate control data
are inserted into a READ command for commanding data storage system 12 to perform a desired read. Such command
includes the number of LBA addressed sectors to be read as well as the logical address in LBA of a first one of the
sectors. One READ command is sent by host processor 11 to data storage system 12 in step 232; there can be a
number of READ commands sent for fetching a plurality of groups of record blocks. Data storage system 12 receives
the READ command. At step 233, data storage system 12 checks the sector compress bit of the first sector storing the
requested group to be read. If bit C 177 (Fig. 8B) is unity, then the data are compressed. Data storage system 12 then
in step 234 reads the requested group including decompressing the data. It is to be noted that if the READ command
field 150 indicates decompression is OFF, then no decompression occurs even if bit C 177 is set to unity. On the other
hand, if bit C 177 equals zero (data in the sector are not compressed), then at step 235 data storage system 12 reads
and sends the read data without decompression to host processor 11. The Fig. 2 illustrated system exits the read
operation for one group from either step 234 or 235.

[0074] Fig. 13 illustrates operation of data storage system 12 responding to a READ command 145. Step 236 receives
the READ command. Step 237 checks the compress field 150. If the compress field indicates that decompress is ON,
then C bit 177 of the sector being accessed is checked to ensure that the data to be read are in fact recorded and stored
in a compressed form. Step 238 executes the READ command, decompressing the data being read if field 150
indicates decompression and C bit 177 is ON. If field 150 indicates decompression is OFF, the data stored in the
addressed sectors are transferred without decompression whether compressed or not. That is, in all cases, data storage
system 12 transfers the data without decompression if field 150 indicates compress is OFF. This control enables
transferring data in either compressed or decompressed form.
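
The read-side rule described for steps 237 and 238 reduces to a two-input decision: data are decompressed only when the READ command asks for decompression and the sector's C bit marks the data as compressed. A minimal sketch follows, with `zlib` standing in for the decompressor and with hypothetical names (`device_read`, `c_bit`, `decompress_requested`) that do not come from the specification.

```python
import zlib


def device_read(stored, c_bit, decompress_requested):
    """Return the bytes sent to the host for one recorded group.

    stored               -- bytes as recorded in the addressed sectors
    c_bit                -- True if the sector is flagged as compressed
    decompress_requested -- True if the READ command asks for decompression
    """
    if decompress_requested and c_bit:
        return zlib.decompress(stored)   # stand-in for the decompressor
    # Either decompression is off or the data were never compressed.
    return stored


if __name__ == "__main__":
    original = b"sector data " * 50
    recorded = zlib.compress(original)
    assert device_read(recorded, True, True) == original    # normal read
    assert device_read(recorded, True, False) == recorded   # raw transfer
    assert device_read(original, False, True) == original   # uncompressed data
```

The raw-transfer case is what allows compressed data to be forwarded and decompressed in a unit other than the storing data storage system.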

[0075] Fig. 14 illustrates one application of the invention in a system having linked host processors. Both batch and
in-line data compression/decompression are employed. Compression-decompression software modules 251 and 273
provide batch data compression and decompression while integrated circuit chips (hardware compress-decompress)
253 and 272 provide in-line (real time) data compression-decompression. Two data processing systems 240 and 241
are linked by data link 263. Link 263 may be a local area network (LAN), a data communication circuit or the transfer of a
removable data cartridge, manually or via a library, mail, etc., between the two data processing systems. Host processor
250 in system 240 has a software compress-decompress facility 251, a transfer link facility 252 that involves no
compression or decompression and an in-line hardware compress-decompress facility 253. Facilities 251-253 may be
physically located in data processing system 240 in host processor 250 or as a part of a channel connection that
includes logic switch 254 (programmed or hardware) connecting host processor 250 to facilities 251-253. Dashed line
255 indicates that switch 254 is programmingly controlled by host processor 250. A given data processing system may
have only 1) batch compress facility 251 and link facility 252, 2) in-line facility 253 and link facility 252, 3) all facilities
251-253 or 4) either facility 251 or 253 may be located either in data storage system 262 or data link 263.

[0076] The input-output (IO) connections from facilities 251-253 are effected by logic switch 260 that is programmingly
controlled by host processor 250 as indicated by dashed line 261. Switch 260 directs IO data flow between facilities
251-253 and a data storage system 262 or data link 263.

[0077] Data processing system 241 is shown as being identical to data processing system 240. Data processing
system 241 includes host processor 270 that may have a different computational arrangement and capability from host
processor 250, logic switch 271, facilities 272-274, data storage system 275 and switch 277 that selectively connects
data processing system 241 to data link 263 and thereby to other systems and data processing system 240.

[0078] Fig. 15 illustrates updating a recorded group of compressed data blocks. Host processor 11 in step 280 has
updated data blocks and desires to update a file recorded in data storage system 12 as a plurality of groups of
compressed data blocks. Step 281 compares the data length (number of uncompressed data bytes) of the updating DTU
with the number of bytes in sectors currently recorded as one group to be updated. Host processor 11 also examines
the number of padding bytes in a last sector storing compressed data for estimating whether or not the updated data
blocks are storable in the currently allocated sectors for the group(s) to be updated.

[0079] At step 282 host processor 11 determines whether the updating DTU can be stored in currently allocated
sectors or whether more or different sectors should be allocated. That is, if the updating DTU has more data bytes than the
currently recorded group, then additional sectors are allocated at step 288 (host processor 11 does the allocation).
Such new sectors are preferably contiguous sectors that may not include any sectors containing the recorded group
of data blocks to be updated. Following allocation step 288, the updating DTU is recorded at step 289. Then, host
processor 11 at step 290 deallocates the sectors containing the group of data blocks to be updated. The Fig. 2 illustrated
system then exits the updating operation from step 290.

[0080] If, at step 282, the number of data bytes in the updating DTU is substantially equal to the number of bytes 
(uncompressed) of the recorded DTU, then the updating occurs at step 283 using the sectors currently storing the 
group to be updated. The Fig. 2 illustrated system then performs step 290 before exiting the updating operation. If the 
updating DTU has fewer bytes than the recorded group, then the updating DTU is recorded in sectors selected from 
the sectors containing the group to be updated. The sectors not used to record the updating DTU are deallocated at 
step 290. 
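
The three update cases of step 282 (larger, substantially equal, smaller) can be sketched as a small decision function. Everything named here is an assumption made for illustration: `plan_update`, the `tolerance` used to model "substantially equal", and the proportional estimate of how many existing sectors a smaller DTU keeps. The specification itself leaves those details to the host processor.

```python
import math


def plan_update(new_bytes, recorded_bytes, current_sectors, tolerance=0.05):
    """Return (action, sectors_reused, sectors_to_deallocate).

    new_bytes       -- uncompressed size of the updating DTU
    recorded_bytes  -- uncompressed size of the recorded DTU (must be > 0)
    current_sectors -- addresses of the sectors holding the recorded group
    """
    sectors = list(current_sectors)
    if new_bytes > recorded_bytes * (1 + tolerance):
        # Steps 288-290: record in newly allocated sectors, free the old ones.
        return "relocate", [], sectors
    if new_bytes >= recorded_bytes * (1 - tolerance):
        # Step 283: substantially equal, so update in the current sectors.
        return "in_place", sectors, []
    # Fewer bytes: reuse some of the current sectors and free the rest.
    # The proportional share assumes a similar compression ratio.
    keep = max(1, math.ceil(len(sectors) * new_bytes / recorded_bytes))
    return "shrink", sectors[:keep], sectors[keep:]


if __name__ == "__main__":
    print(plan_update(2000, 1000, [10, 11, 12]))
    print(plan_update(1000, 1000, [10, 11, 12]))
    print(plan_update(500, 1000, [10, 11, 12, 13]))
```

In the relocate case the new sectors are allocated by the host before recording, so the function only reports which old sectors become free.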

[0081] It may be decided, independently of any data growth patterns, always to store the updated data blocks
in a newly allocated set of sectors and to deallocate or free the sectors storing the current group(s) of compressed
data blocks to be updated. In this situation, steps 288-290 are performed. For example, if there is a desire to save the
original group(s) of compressed data blocks, such original recording may be retained. Host processor 11 then updates
the appropriate file directory 160-162 and exits the storage operation.

[0082] In the updating operation shown in Fig. 15, whenever the compressed data have more bytes than the original
uncompressed data, the data are recorded in an uncompressed form; that is, the steps shown in Fig. 11 are added to the
Fig. 15 illustrated sequence.



Claims 

1. Apparatus for storing data in compressed form in a data storage device having a plurality of addressable like-sized
data storage areas, each for recording a predetermined number of data bytes, the data storage device being
connected to means for receiving data to be recorded, said received data being arranged in a plurality of addressable
data blocks, characterised in that the apparatus comprises, in combination:

selection means in the means for receiving data for selecting one or more data transfer units of data blocks 
to be recorded, each said transfer unit of data blocks having a given number of data bytes and including one 
or more of said addressable data blocks; 

allocation means connected to the selection means for responding to the number of data bytes in each said 
transfer unit of data blocks to determine, based on said number of data bytes, a required first number of 
addressable data storage areas for storing said transfer unit of data blocks, and to indicate that said transfer 
unit requires said first number of addressable data storage areas to record said transfer unit; 

compression means connected to the selection means for compressing said transfer unit of data blocks to be 
recorded as a group of compressed data blocks; 

data access means in the device connected to said compression means for recording said group of com- 
pressed data blocks in a second number of said addressable data storage areas as one continuum of com- 
pressed data, said second number being equal to or less than said first number; and 

directory means indicating which ones of said addressable data storage areas said continuum of data is re- 
corded in, and indicating that said continuum of data contains said selected transfer unit of data blocks in a 
compressed form. 

2. Apparatus according to claim 1 including update means connected to said selection means and to said allocation
means for updating a recorded group of compressed data blocks with updated data blocks, including receiving
updated ones of said data blocks and selecting a data transfer unit of data blocks to include said updated data
blocks; said update means being connected to said allocation means for allocating a number of said addressable
data storage areas for receiving and recording said updated compressed data blocks and for deallocating ones of
said allocated addressable data storage areas in which are stored the original group of compressed data blocks
to be updated.

3. Apparatus according to claim 1 or claim 2, wherein the selection means is connected to the data storage device
for selecting said given number of data bytes to form a transfer unit in dependence on the number of data bytes 
which are recordable in each of said data storage areas. 

4. Apparatus according to any one of claims 1 to 3, including range means indicating a range of number of bytes to
be used for transferring data blocks between said means for receiving data and said data storage device, wherein 
said selection means is connected to said range means for receiving said range indication and responding to the 
received range indication for selecting a predetermined number of said data blocks to be a data transfer unit and 
to be in one of said groups of data blocks such that each said group of data blocks has a number of data bytes 
equivalent to a plurality of uncompressed data blocks. 

5. Apparatus according to any one of claims 1 to 4 wherein the apparatus includes:

CKD means for supplying a plurality of CKD data blocks;

a CKD formatted disk within the data storage device for receiving and recording CKD data; 

said selection means being connected to said CKD means for receiving and selecting a predetermined number 
of said CKD data blocks as a data transfer unit of said CKD data blocks; 

said data access means having CKD recording means for recording said transfer unit of CKD data as com- 
pressed by said compression means as a single record on said CKD formatted disk; and 

repeat means connected to said selection means and to said CKD means for repeatedly actuating the CKD 
means to supply a transfer unit of CKD data blocks for compression and recording in respective single CKD 
records. 

6. Apparatus according to any one of claims 1 to 4, including:

a host processor connected to a peripheral controller, said data storage device being connected to said pe- 
ripheral controller; 

an FBA sectored disk in said data storage device having a plurality of addressable sectors for receiving and 
recording data blocks; 

said selection means having means for selecting said data blocks for said data transfer unit to be recorded in 
a predetermined number of sectors on said FBA sectored disk; and 

repeat means connected to said selection means for repeatedly actuating the selection means for selecting 
a plurality of said transfer units of data blocks from one file of such data blocks for compression and recording 
of said transfer units of data blocks such that said file of data blocks is recorded in compressed form on said 
FBA sectored disk in a plurality of said continuum of data wherein each said continua consists of one said 
group of compressed data blocks. 

7. Apparatus according to any one of the preceding claims, including data recording management means connected
to said directory means and to said data access means for actuating the directory means to establish a plurality 
of said file directories, one file directory for each file of data recorded in compressed form; said recording man- 
agement means actuating said directory means to record in each of said file directories a number of said data 
blocks to be included in each of said data transfer units of data and including recording a maximum number of 
bytes to be included in any one of said data transfer units. 

8. A method of compressing and recording onto a data storage medium data of a file which is arranged in a plurality
of addressable data blocks, characterised in that the method comprises the steps of:

selecting (10) a plurality of said data blocks of said file to be compressed and recorded;

segmenting the selected plurality of addressable data blocks into one or more data transfer units, each data
transfer unit of data blocks having a given number of data bytes and including one or more of said addressable
data blocks;

allocating (13) a first number of addressable data storage areas of the storage medium for recording each of
said one or more data transfer units as respective separate groups of compressed data blocks, said first
number being determined based on the number of data bytes in each of said one or more data transfer units;

compressing (15) each of said one or more data transfer units and recording them as respective separate
groups of compressed data blocks in a second number of addressable data storage areas of the storage
medium as one continuum of compressed data for each group, said second number being equal to or less
than said first number;

creating and maintaining (16) a file directory indicating the address and size of each of said recorded groups
for enabling random access to recorded data within said file of data blocks.

9. A method according to claim 8 including the step, after recording said one group of compressed data blocks, of
deallocating (219) allocated addressable data storage areas, if any, into which said one group of compressed data
blocks was not recorded.

10. A method according to claim 8 or claim 9, including:

supplying CKD formatted data blocks of one CKD formatted file and selecting said CKD data blocks to be
compressed and recorded;

compressing one or more data transfer units of said CKD data blocks into one or more groups, respectively,
of compressed CKD data blocks; and

recording the one or more groups of compressed CKD data blocks as one record on a CKD formatted record
member.

11. A method according to claim 8 or claim 9, including:

selecting an FBA formatted record medium to be said record medium, said FBA formatted record medium
having a plurality of addressable data-storing sectors, each data-storing sector being capable of recording a
given number of data bytes; and

selecting said data transfer unit to have a first predetermined number of said data blocks having a number of
uncompressed data bytes equal to a data storage capacity, in data bytes, of a second predetermined number
of said data-storing sectors.

12. A method according to any one of claims 8 to 11, including:

setting a range of number of bytes to be included in each of said data transfer units; and

selecting a number of said data blocks such that a number of data bytes in the selected number of data blocks
is within the set range.

Patentansprüche

1. Eine Vorrichtung zum Speichern von Daten in komprimierter Form in einem Datenspeichergerät mit einer Vielzahl
von adressierbaren, gleich großen Datenspeicherbereichen, wobei jeder für die Aufzeichnung einer vordefinierten
Anzahl an Datenbytes vorgesehen ist, wobei das Datenspeichergerät mit einem Mittel zum Empfang der
aufzuzeichnenden Daten verbunden ist, wobei die genannten empfangenen Daten in einer Vielzahl an adressierbaren
Datenblöcken angeordnet sind, dadurch gekennzeichnet, dass die Vorrichtung in Kombination folgendes
umfasst:

Ein Mittel zur Auswahl im Mittel zum Empfang der Daten, um eine oder mehrere Datenübertragungseinheiten
der aufzuzeichnenden Datenblöcke auszuwählen, wobei jede Datenübertragungseinheit über eine gegebene
Anzahl an Datenbytes verfügt und einen oder mehrere der genannten adressierbaren Datenblöcke enthält;

Ein Mittel zur Zuordnung, verbunden mit dem Mittel zur Auswahl, um fur die Anzahl an Datenbytes in jeder 
genannten Ubertragungseinheit mit Datenblocken, basierend auf der genannten Anzahl an Datenbytes, eine 
erforderliche erste Anzahl an adressierbaren Datenspeicherbereichen festzulegen, zum Speichern der ge- 
nannten Datenubertragungseinheit mit Datenblocken, und um anzuzeigen, dass die genannte Datenubertra- 
gungseinheit die genannte erste Anzahl an adressierbaren Datenspeicherbereichen benotigt, um die genannte 
Datenubertragungseinheit aufzuzeichnen; 

Ein Mittel zur Kompression, verbunden mit dem mittel zur Auswahl, zur Kompression der genannten Obertra- 
gungseinheit mit Datenblocken, die als eine Gruppe mit komprimierten DatenbtOcken aufzuzeichnen sind; 

Ein Mittel zum Datenzugriff in dem Gerat, das mit dem Mittel zur Kompression verbunden ist, um die genannte 
Gruppe mit komprimierten Datenblocken in einer zweiten Anzahl der genannten adressierbaren Datenspei- 
cherbereiche als zusammenhangende Einheit an komprimierten Daten aufzuzeichnen, wobei die genannte 
zweite Anzahl gleich oder kleiner als die genannte erste Anzahl ist; und 

Ein Verzeichnismittel zur Anzeige, in welchem der genannten adressierbaren Datenspeicherbereiche die ge- 
nannte zusammenhangende Dateneinheit aufgezeichnet wurde sowie zur Anzeige, dass die zusammenhan- 
gende Dateneinheit die genannte ausgewahlte Ubertragungseinheit mit Datenblocken in komorimierter Form 
enthait. 

Eine Vorrichtung nach Anspruch 1 , einschlieBlich einem Aktualisierungsmittel, verbunden mit dem genannten Aus- 
wahlmittel und dem genannten Zuordnungsmittel, zum Aktualisieren einer aufgezeichneten Gruppe mit kompri- 
mierten Datenblocken mit aktualisierten Datenblocken, einschlieBlich dem Empfang der aktualisierten genannten 
Datenblocke sowie der Auswahl einer Datenubertragungseinheit mit Datenblocken zur Einbeziehung der genann- 
ten aktualisierten Datenblocke, wobei das genannte Aktualisierungsmittel mit dem genannten Zuordnungsmittel 
verbunden ist, um eine Anzahl der genannten adressierbaren Datenspeicherbereiche zum Empfang und Aufzeich- 
nen der genannten aktualisierten komprimierten Datenblocke zuzuordnen und die Zuordnung derjenigen der zu- 
geordneten adressierbaren Datenspeicherbereiche riickgangig zu machen, in denen die ursprungliche Gruppe 
mit komprimierten, zu aktualisierenden Datenblocken enthalten ist. 

Eine Vorrichtung nach Anspruch 1 oder nach Anspruch 2, wobei das Mittel zur Auswahl mit dem Datenspeicher- 
gerat verbunden ist, um die genannte gegebene Anzahl an Datenbytes auszuwahlen, um eine Ubertragungseinheit 
zu bilden, abh&ngig von der Anzahl an Datenbytes, die in jedem Datenspeicherbereich aufgezeichnet werden kann . 

Eine Vorrichtung nach jedem der Anspruche 1 bis 3, einschlieBlich einem Mittel zur Angabe des Bereichs fur die 
Anzahl an Datenbytes, zur Verwendung fur die Ubertragung von Datenbldcken zwischen dem genannten Mittel 
zum Empfang von Daten und dem genannten Datenspeichergerat, wobei das genannte Mittel zur Auswahl mit 
dem genannten Mittel zur Angabe des Daten bytebereichs verbunden ist, um die genannte Bereichsangabe zu 
empfangen und um auf die empfangene Bereichsangabe zur Auswahl einer vordefinierten Anzahl an genannten 
Datenblocken zu reagieren, die eine Datenubertragungseinheit bilden und sich in einer der genannten Gruppen 
mit Datenblocken befinden, so dass jede der genannten Gruppen mit Datenbl6cken uber eine Anzahl an Daten- 
bytes verfugt, die mit der Vielzahl an nicht komprimierten Datenblocken ubereinstimmt. 

Eine Vorrichtung nach jedem der Anspruche 1 bis 4, wobei die Vorrichtung folgendes umfasst: 

Ein CKD Mittel zum Bereitstellen einer Vielzahl an CKD Datenblocken; 

Ein CKD-formatierter Datentrager innerhalb des Datenspeichergerats zum Empfang und Aufzeichnen von 
CKD Daten; 

Wobei das genannte Auswahlmittel mit dem genannten CKD Mittel verbunden ist, um eine vordefinierte Anzahl 
der genannten CKD Datenblocke als eine Datenubertragungseinheit der genannten Datenblocke zu empfan- 
gen und auszuwahlen; 






Wobei das genannte Datenzugriffsmittel uber ein CKD Aufzeichnungsmittel verfugt, um die genannte Ober- 
tragungseinheit mit CKD Daten aufzuzeichnen, die durch das genannte Kompressionsmittel als einzelner Da- 
tensatz auf dem genannten CKD-formatierten Datentrager komprimiert ist; und 

Ein Mittel zur Wiederholung, verbunden mit dem genannten Auswahlmittel und mit dem genannten CKD Mittel, 
um auf das CKD Mittel wiederholt zuzugreifen und so eine Ubertragungseinheit mit CKD Datenblocken zur 
Kompression und Aufzeichnung in entsprechenden einzelnen Datensatzen bereitzustellen. 

Eine Vorrichtung nach jedem der Anspruche 1 bis 4, einschlieBlich: 

Einem Hostprozessor, verbunden mit einem peripheren Controller, wobei das genannte Datenspeichergerat 
mit dem peripheren Controller verbunden ist; 

Einem in FBA-Sektoren unterteilten Datentrager mit einer Vielzahl an adressierbaren Sektoren zum Empfang 
und zum Aufzeichnen von DatenblScken; 

Wobei das genannte Auswahlmittel uber ein Mittel zum Auswahlen der genannten Datenblocke fur die ge- 
nannte Datenubertragungseinheit verfugt, die in vordefinierter Anzahl an Sektoren auf dem genannten in 
FBA-Sektoren unterteilten Datentrager aufzuzeichnen ist; und 

Einem Mittel zur Wiederholung, verbunden mit dem genannten Mittel zum wiederholten Zugriff auf das Aus- 
wahlmittel zur Auswahl einer Vielzahl der genannten Ubertragungseinheiten mit Datenblocken von einer Datei 
dieser Datenblocke zur Kompression und zur Aufzeichnung der genannten Ubertragungseinheiten mit Daten- 
bl6cken, so dass die genannte Datei mit Datenbl6cken in komprimierter Form auf dem in FBA-Sektoren un- 
terteilten Datentrager in einer Vielzahl an zusammenhangenden Dateneinheiten aufgezeichnet wird, wobei 
jede zusammenhangende Dateneinheit aus einer der genannten Gruppen mit komprimierten Datenblocken 
besteht. 

Eine Vorrichtung nach einem der vorangegangenen Anspruche, einschlieBlich einem Mittel zur Verwattung der 
Datenaufzeichnung, verbunden mit dem genannten Verzeichnismittel sowie mit dem genannten Datenzugriffsmit- 
tel fur den Zugriff auf das Verzeichnismittel zum Erstellen einer Vielzahl der genannten Dateiverzeichnisse, wobei 
ein Dateiverzeichnis fur jede Datei mit Daten in komprimierter Form bereitgestellt wird; wobei das genannte Mittel 
zum Verwalten der Aufzeichnung auf das genannte Verzeichnismittel zugreift, um in jedem der genannten Datei- 
verzeichnisse eine Anzahl der genannten Datenblocke aufzuzeichnen, die in jede der Datenubertragungseinheiten 
eingefugt werden sollen sowie einschlieBlich dem Aufzeichnen einer maximalen Anzahl an Bytes fur jede der 
Datenubertragungseinheiten. 

8. A method of compressing and recording, on a data storage medium, data of a file that is arranged in a plurality of addressable data blocks, characterized in that the method comprises the following steps:

selecting (10) a plurality of said data blocks of said file for compression and recording;

segmenting the selected plurality of addressable data blocks into one or more data transfer units, each data transfer unit having a given number of data bytes and comprising one or more of said data blocks;

allocating (13) a first number of addressable data storage areas of the storage medium for recording each of said data transfer units as a respective separate group of compressed data blocks, said first number being determined on the basis of the number of data bytes in each of the data transfer units;

compressing (15) each of said data transfer units and recording it as a respective separate group of compressed data blocks in a second number of addressable data storage areas of the storage medium, as contiguous compressed data for each group, the second number being equal to or less than the first number;



creating and maintaining (16) a file directory that indicates the address and the size of each of said recorded groups, to enable random access to recorded data within said file of data blocks.
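
The steps of the method claim above (segment, allocate, compress, record, maintain a directory) can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the 512-byte sector size, the dictionary standing in for the medium, the directory field names, the use of zlib, and the fallback that stores an incompressible unit uncompressed are all assumptions introduced here.

```python
import zlib

SECTOR_SIZE = 512  # assumption: bytes per addressable storage area


def sectors_needed(num_bytes):
    """Ceiling division: storage areas needed to hold num_bytes."""
    return -(-num_bytes // SECTOR_SIZE)


def record_file(blocks, blocks_per_unit):
    """Return (directory, medium) for a file given as a list of bytes blocks."""
    directory = []  # one entry per recorded group
    medium = {}     # hypothetical medium: sector address -> bytes
    next_free = 0
    for first in range(0, len(blocks), blocks_per_unit):
        # Segment: one data transfer unit of one or more blocks.
        unit = b"".join(blocks[first:first + blocks_per_unit])
        # Allocate: the "first number", sized for the uncompressed unit.
        allocated = sectors_needed(len(unit))
        # Compress the unit as one group.
        group = zlib.compress(unit)
        compressed = sectors_needed(len(group)) <= allocated
        if not compressed:
            # Assumption: keep the unit as is so second number <= first number.
            group = unit
        used = sectors_needed(len(group))  # the "second number"
        # Record the group in contiguous sectors.
        for i in range(used):
            medium[next_free + i] = group[i * SECTOR_SIZE:(i + 1) * SECTOR_SIZE]
        # Directory entry: address and size of the recorded group.
        directory.append({
            "first_block": first,
            "address": next_free,
            "sectors": used,
            "allocated": allocated,
            "length": len(group),
            "compressed": compressed,
        })
        next_free += used  # sectors beyond `used` are simply not retained
    return directory, medium
```

Because every directory entry carries its own address and size, a reader can locate one group without scanning the groups that precede it.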

9. A method according to claim 8, including the step, after recording said one group of compressed data blocks, of deallocating (219) those allocated addressable data storage areas, if any, in which said group of compressed data blocks was not recorded.
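
As a rough illustration of the deallocation step, the helper below splits an allocation into the sectors actually used by the compressed group and the sectors to be released. The function name, the list-of-addresses representation, and the 512-byte default sector size are hypothetical and chosen only for this sketch.

```python
def release_unused(allocated, compressed_len, sector_size=512):
    """Split allocated sector addresses into (used, freed).

    `allocated` is the list of sector addresses reserved for the
    uncompressed transfer unit; `compressed_len` is the byte length of
    the group actually recorded.
    """
    used_count = -(-compressed_len // sector_size)  # ceiling division
    if used_count > len(allocated):
        raise ValueError("compressed group does not fit the allocation")
    return allocated[:used_count], allocated[used_count:]
```

For example, a 1300-byte group recorded into an eight-sector allocation occupies three sectors and frees the remaining five.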

10. A method according to claim 8 or claim 9, including:

providing CKD-formatted data blocks of a CKD-formatted file and selecting said CKD data blocks for compression and recording;

compressing one or more data transfer units of said CKD data blocks into one or more groups of compressed CKD data blocks; and

recording the groups of compressed CKD data blocks as one record on a CKD-formatted recording medium.

11. A method according to claim 8 or claim 9, including:

selecting an FBA-formatted recording medium, said FBA-formatted recording medium having a plurality of addressable data-storing sectors, each of the data-storing sectors being capable of recording a given number of data bytes; and

selecting said data transfer unit to have a first predetermined number of said data blocks having a number of uncompressed data bytes equal to the data storage capacity, in data bytes, of a second predetermined number of said data-storing sectors.
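
The sizing condition in the claim above (a whole number of blocks whose uncompressed bytes exactly fill a whole number of sectors) can be illustrated with a small calculation. The claim only requires that the two byte counts be equal; choosing the smallest such pair via the least common multiple is an assumption of this sketch, as is the function name.

```python
from math import gcd


def unit_geometry(block_size, sector_size):
    """Smallest (blocks, sectors) with blocks*block_size == sectors*sector_size."""
    # Least common multiple of the two sizes, in bytes.
    unit_bytes = block_size * sector_size // gcd(block_size, sector_size)
    return unit_bytes // block_size, unit_bytes // sector_size
```

With 4096-byte blocks and 512-byte sectors one block fills eight sectors; with 800-byte blocks, sixteen blocks fill twenty-five sectors. Any integer multiple of the returned pair satisfies the same equality.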

12. A method according to any one of claims 8 to 11, including:

setting a range for the number of bytes that can be included in each data transfer unit; and

selecting a number of said data blocks such that the number of data bytes within the selected number of data blocks lies within the set range.
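
One way to read the range-based selection above is sketched below for blocks of varying length. The policy of taking as many leading blocks as fit under the upper bound is an assumption; the claim only requires that the byte total fall within the set range. The function and parameter names are hypothetical.

```python
def select_blocks(block_lengths, min_bytes, max_bytes):
    """Return how many leading blocks form a transfer unit, or None.

    Blocks are taken in order while the running byte total stays at or
    below max_bytes; the selection is valid only if the total also
    reaches min_bytes.
    """
    total = 0
    count = 0
    for length in block_lengths:
        if total + length > max_bytes:
            break
        total += length
        count += 1
    if count > 0 and total >= min_bytes:
        return count
    return None
```

A return value of None signals that no prefix of the blocks fits the range, so the caller would have to adjust the range or handle the block separately.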



Claims

1. Apparatus for storing data in compressed form in a data storage device having a plurality of addressable, equal-sized data storage areas, each for recording a predetermined number of data bytes, the data storage device being connected to means for receiving data to be recorded, said received data being arranged in a plurality of addressable data blocks, characterized in that the apparatus comprises, in combination:

selection means in the means for receiving data, for selecting one or more data transfer units of the data blocks to be recorded, each said transfer unit of data blocks having a given number of data bytes and comprising one or more of said addressable data blocks,

allocation means connected to the selection means and responsive to the number of data bytes in each said transfer unit of data blocks for determining, on the basis of said number of data bytes, a first required number of addressable data storage areas for storing said transfer unit of data blocks, and for indicating that said transfer unit requires said first number of addressable data storage areas for recording said transfer unit,

compression means connected to the selection means for compressing said transfer unit of data blocks to be recorded as a group of compressed data blocks,

data access means in the apparatus connected to said compression means for recording said group of compressed data blocks in a second number of said addressable data storage areas as a contiguous sequence of compressed data, said second number being less than or equal to said first number, and

directory means indicating in which of said addressable data storage areas said contiguous sequence of data is recorded, and indicating that said contiguous sequence of data contains said selected transfer unit of data blocks in compressed form.

2. Apparatus according to claim 1, including update means connected to said selection means and to said allocation means for updating a recorded group of compressed data blocks with updated data blocks, including receiving the updated ones of said data blocks and selecting a data transfer unit of the data blocks to include said updated data blocks, said update means being connected to said allocation means for allocating a number of said addressable data storage areas to receive and record said updated compressed data blocks and for deallocating those allocated addressable data storage areas in which the original group of compressed data blocks to be updated is stored.

3. Apparatus according to claim 1 or claim 2, wherein the selection means is connected to the data storage device for selecting said given number of data bytes making up a transfer unit according to the number of data bytes recordable in each of said data storage areas.

4. Apparatus according to any one of claims 1 to 3, including range means indicating a range of a number of bytes to be used for transferring data blocks between said means for receiving data and said data storage means, wherein said selection means is connected to said range means for receiving said range indication and is responsive to the received range indication to select a predetermined number of said data blocks to constitute a data transfer unit and to be included in one of said groups of data blocks, such that each said group of data blocks has a number of data bytes equivalent to a plurality of uncompressed data blocks.

5. Apparatus according to any one of claims 1 to 4, wherein the apparatus comprises:

CKD format means for supplying a plurality of CKD-formatted data blocks,

a CKD-formatted disk within the data storage device for receiving and recording CKD-formatted data,

said selection means being connected to said CKD format means for receiving and selecting a predetermined number of said CKD-formatted data blocks as a data transfer unit of said CKD-formatted data blocks,

said data access means having CKD recording means for recording said transfer unit of CKD-formatted data, as compressed by said compression means, as a single record on said CKD-formatted disk, and

iteration means connected to said selection means and to said CKD format means for repeatedly actuating the CKD format means to supply a transfer unit of CKD-formatted data blocks for compression and recording in respective single CKD-formatted records.

6. Apparatus according to any one of claims 1 to 4, comprising:

a host processor connected to a peripheral controller, said data storage device being connected to said peripheral controller,



an FBA-architecture sectored disk in said data storage device having a plurality of addressable sectors for receiving and recording data blocks,

said selection means having means for selecting said data blocks for said data transfer unit to be recorded in a predetermined number of sectors on said FBA-architecture sectored disk, and

iteration means connected to said selection means for repeatedly actuating the selection means to select a plurality of said transfer units of data blocks from a file of such data blocks, for compression and recording of said transfer units of data blocks, such that said file of data blocks is recorded in compressed form on said FBA-architecture sectored disk in a plurality of said contiguous sequences of data, each of said contiguous sequences consisting of one of said groups of compressed data blocks.

7. Apparatus according to any one of the preceding claims, including data recording management means connected to said directory means and to said data access means for actuating the directory means to establish a plurality of said file directories, one file directory for each file of data recorded in compressed form, said recording management means actuating said directory means to record in each of said file directories a number of said data blocks to be included in each of said data transfer units, and including recording a maximum number of bytes to be included in any one of said data transfer units.

8. A method of compressing and recording, on a data storage medium, data of a file that are arranged in a plurality of addressable data blocks, characterized in that the method comprises the steps of:

selecting (10) a plurality of said data blocks of said file to be compressed and recorded,

segmenting the selected plurality of addressable data blocks into one or more data transfer units, each data transfer unit of data blocks having a given number of data bytes and comprising one or more of said addressable data blocks,

allocating (13) a first number of addressable data storage areas of the storage medium for recording each of said one or more data transfer units as respective separate groups of compressed data blocks, said first number being determined on the basis of the number of data bytes in each of said one or more data transfer units,

compressing (15) each of said one or more data transfer units and recording them as respective separate groups of compressed data blocks in a second number of addressable data storage areas of the storage medium, as a contiguous sequence of compressed data for each group, said second number being less than or equal to said first number,

creating and maintaining (16) a file directory indicating the address and size of each of said recorded groups, to enable random access to the data recorded within said file of data blocks.
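
The random access that the file directory is meant to enable can be illustrated with a read path that touches only the group holding the requested block. This is a sketch under stated assumptions: fixed 1024-byte blocks, 512-byte sectors, a flat bytes object standing in for the medium, zlib as the compressor, and hypothetical directory fields (`first_block`, `blocks`, `address`, `length`).

```python
import zlib

BLOCK_SIZE = 1024   # assumption: fixed uncompressed block length in bytes
SECTOR_SIZE = 512   # assumption: bytes per addressable sector


def read_block(block_no, directory, medium):
    """Return one uncompressed block, reading only the group that holds it."""
    for entry in directory:
        first = entry["first_block"]
        if first <= block_no < first + entry["blocks"]:
            # The directory gives the group's sector address and byte size.
            start = entry["address"] * SECTOR_SIZE
            group = medium[start:start + entry["length"]]
            # Decompress just this group, then slice out the block.
            unit = zlib.decompress(group)
            offset = (block_no - first) * BLOCK_SIZE
            return unit[offset:offset + BLOCK_SIZE]
    raise KeyError(block_no)
```

The cost of reading one block is bounded by the size of a single transfer unit, which is the practical reason for compressing each unit as a separate group.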

9. A method according to claim 8, including the step, after recording said one group of compressed data blocks, of deallocating (219) allocated addressable data storage areas, if any, in which said one group of compressed data blocks was not recorded.

10. A method according to claim 8 or claim 9, including:

providing CKD-formatted data blocks of a CKD-formatted file and selecting said CKD-formatted data blocks to be compressed and recorded,

compressing one or more data transfer units of said CKD-formatted data blocks into one or more groups, respectively, of compressed CKD-formatted data blocks, and

recording the one or more groups of compressed CKD-formatted data blocks as one record on a CKD-formatted record member.

11. A method according to claim 8 or claim 9, including:

selecting an FBA-formatted record medium to be said record medium, said FBA-formatted record medium having a plurality of addressable data storage sectors, each data storage sector being capable of recording a given number of data bytes, and

selecting said transfer unit to have a first predetermined number of said data blocks having a number of uncompressed data bytes equal to the data storage capacity, in data bytes, of a second predetermined number of said data storage sectors.

12. A method according to any one of claims 8 to 11, including:

establishing a range for the number of bytes to be included in each of said data transfer units, and

selecting a number of said data blocks such that the number of data bytes in the selected number of data blocks lies within the established range.